ZAPI-756: Search predicates don't work for fs_allowed

Description

fs_allowed seems to not be searchable using predicates via vmapi

sdc-vmapi zone is version release-20160804-20160804T184149Z-g6931dcc

Initially I was searching with a script
'{and: [{ "eq": ["fs_allowed", "*hyprlofs*" ] }, { "eq": [ "status", "running" ]} ]}'
but as a sanity check tried searching via vmapi for things I know exist such as
any entry:
'{ "eq": ["fs_allowed", "*" ] }'
or pcfs specifically
'{ "eq": ["fs_allowed", "*pcfs*" ] }'
I get no results from any of these search predicates.

Whereas if I find an instance grepping via sdc-oneachnode
sdc-oneachnode -a 'grep fs-allowed /etc/zones/*'
...
=== Output from 00000000-0000-0000-0000-002590c0931c (RA116830):
/etc/zones/860ca376-71b9-e9a1-bc66-9b11c5ed7827.xml:<zone name="860ca376-71b9-e9a1-bc66-9b11c5ed7827" zonepath="/zones/860ca376-71b9-e9a1-bc66-9b11c5ed7827" autoboot="true" brand="joyent" ip-type="exclusive" limitpriv="default" debugid="301" fs-allowed="ufs,pcfs,tmpfs">
...
and check that instance specifically in vmapi, the field is definitely there.
[mhicks@headnode (us-east-1) ~/mhicks]$ sdc-vmapi /vms/860ca376-71b9-e9a1-bc66-9b11c5ed7827 | json -Ha fs_allowed
ufs,pcfs,tmpfs


Looking at the VMAPI logs
[2016-12-02T21:37:24.415Z]  INFO: vmapi/api/21512 on 70b63e05-ebc9-462d-b06a-aa96e78d701f: handled: 500 (req_id=83712a90-b8d7-11e6-a733-c34dc69e6574, route=listvms, audit=true, latency=5, _audit=true, req.version=*, req.body="")
    GET /vms?limit=1000&offset=0&predicate=%7B%20%22eq%22%3A%20%5B%22fs_allowed%22%2C%20%22%2Apcfs%2A%22%20%5D%20%7D HTTP/1.1
    host: vmapi.us-east-1.joyent.us
    user-agent: curl/7.42.0
    accept: application/json
    content-type: application/json
    accept-version: *
    --
    HTTP/1.1 500 Internal Server Error
    connection: close
    content-type: application/json
    content-length: 54
    content-md5: fBEkbqc4wlwruhe5T9VhLA==
    date: Fri, 02 Dec 2016 21:37:24 GMT
    server: VMAPI
    x-request-id: 83712a90-b8d7-11e6-a733-c34dc69e6574
    x-response-time: 5
    x-server-name: 70b63e05-ebc9-462d-b06a-aa96e78d701f
    
    {
      "name": "InvalidQueryError",
      "context": {},
      "ase_errors": []
    }
    --
    InvalidQueryError: (fs_allowed=*pcfs*) is an invalid filter
        at buildWhereClause (/opt/smartdc/moray/lib/objects/common.js:585:19)
        at vasync.forEachPipeline.func (/opt/smartdc/moray/lib/control.js:127:13)
        at Object.func (/opt/smartdc/moray/node_modules/vasync/lib/vasync.js:233:12)
        at Object._onImmediate (/opt/smartdc/moray/node_modules/vasync/lib/vasync.js:200:20)
        at processImmediate [as _immediateCallback] (timers.js:330:15)
    --
    req.timers: {
      "handler-0": 46,
      "bunyan": 43,
      "handler-2": 166,
      "parseAccept": 98,
      "readBody": 147,
      "parseBody": 6,
      "parseDate": 5,
      "parseQueryString": 91,
      "_setApp": 6,
      "preFilterVms": 11,
      "listVms": 4839
    }
[2016-12-02T21:37:26.214Z]  INFO: vmapi/api/21512 on 70b63e05-ebc9-462d-b06a-aa96e78d701f: handled: 500 (req_id=848384f0-b8d7-11e6-a733-c34dc69e6574, route=listvms, audit=true, latency=7, _audit=true, req.version=*, req.body="")
    GET /vms?limit=1000&offset=0&predicate=%7B%20%22eq%22%3A%20%5B%22fs_allowed%22%2C%20%22%2A%22%20%5D%20%7D HTTP/1.1
    host: vmapi.us-east-1.joyent.us
    user-agent: curl/7.42.0
    accept: application/json
    content-type: application/json
    accept-version: *
    --
    HTTP/1.1 500 Internal Server Error
    connection: close
    content-type: application/json
    content-length: 49
    content-md5: +HbkcMiAjz1kPCNzlO7DNA==
    date: Fri, 02 Dec 2016 21:37:26 GMT
    server: VMAPI
    x-request-id: 848384f0-b8d7-11e6-a733-c34dc69e6574
    x-response-time: 6
    x-server-name: 70b63e05-ebc9-462d-b06a-aa96e78d701f
    
    {
      "name": "InvalidQueryError",
      "context": {},
      "ase_errors": []
    }
    --
    InvalidQueryError: (fs_allowed=*) is an invalid filter
        at buildWhereClause (/opt/smartdc/moray/lib/objects/common.js:585:19)
        at vasync.forEachPipeline.func (/opt/smartdc/moray/lib/control.js:127:13)
        at Object.func (/opt/smartdc/moray/node_modules/vasync/lib/vasync.js:233:12)
        at Object._onImmediate (/opt/smartdc/moray/node_modules/vasync/lib/vasync.js:200:20)
        at processImmediate [as _immediateCallback] (timers.js:330:15)
    --
    req.timers: {
      "handler-0": 24,
      "bunyan": 40,
      "handler-2": 167,
      "parseAccept": 105,
      "readBody": 258,
      "parseBody": 10,
      "parseDate": 8,
      "parseQueryString": 126,
      "_setApp": 9,
      "preFilterVms": 27,
      "listVms": 6136
    }

It doesn't seem to be the asterisks as
pfexec sdc-vmapi /vms?predicate=$(urlencode '{ "eq": ["state", "*destroy*" ] }')
yields results with "zone_state": "destroyed",


Similarly searching via ldap style queries doesn't work
[mhicks@headnode (us-east-1) ~/mhicks]$ pfexec sdc-vmapi /vms?query=\(fs_allowed=*\)
HTTP/1.1 500 Internal Server Error
Connection: close
Content-Type: application/json
Content-Length: 49
Content-MD5: +HbkcMiAjz1kPCNzlO7DNA==
Date: Fri, 02 Dec 2016 21:57:35 GMT
Server: VMAPI
x-request-id: 55433cf0-b8da-11e6-a733-c34dc69e6574
x-response-time: 6
x-server-name: 70b63e05-ebc9-462d-b06a-aa96e78d701f

{
  "message": "(fs_allowed=*) is an invalid filter"
}

fs_allowed is listed in the VM Object section of
https://github.com/joyent/sdc-vmapi/blob/master/docs/index.md

Comments

Comment by Michael Hicks
Created at 2016-12-13T03:40:06.000Z
@julien.gilli You had asked in jabber:

"did you expect a ListVms request using the predicate '{and: { "eq": ["fs_allowed", "*hyprlofs*" }, { "eq": "status", "running" } ]}' to return some VM objects, but it didn't? The errors you got when using a predicate that mentions only the "fs_allowed" property are expected to return an error, but when using that property in a "composite" predicate, it should filter entries correctly and as expected"

I cooked up a demo version of a script I had been using to make the queries
#!/usr/bin/bash

page=0
limit=1000
waiting_period=10
VMAPI='/opt/smartdc/bin/sdc-vmapi'
JSON='/usr/bin/json'

urlencode() {
    # urlencode <string>

    local length="${#1}"
    for (( i = 0; i < length; i++ )); do
        local c="${1:i:1}"
        case $c in
            [a-zA-Z0-9.~_-]) printf "$c" ;;
            *) printf '%%%02X' "'$c"
        esac
    done
}

declare -a FILTERS=(
    '{and: [{ "eq": ["fs_allowed", "*hyprlofs*" ] }, { "eq": [ "status", "running" ]} ]}'
    '{and: [{ "eq": ["fs_allowed", "*hyprlofs*" ] }, { "eq": [ "status", "stopped" ]} ]}'
    '{ "eq": ["fs_allowed", "*pcfs*" ] }'
    '{and [{ "eq": ["fs_allowed", "*" ]}, { "eq": [ "status", "running" ]} ]}'
    '{and [{ "eq": ["fs_allowed", "*" ]}, { "eq": [ "status", "*" ]} ]}'
    '{and [{ "eq": ["fs_allowed", "ufs"]}, { "eq": [ "status", "running" ]} ]}'
    '{and [{ "eq": ["fs_allowed", "pcfs"]}, { "eq": [ "status", "running" ]} ]}'
    '{and [{ "eq": ["fs_allowed", "tmpfs"]}, { "eq": [ "status", "running" ]} ]}'
    '{and [{ "eq": ["fs_allowed", "*ufs*"]}, { "eq": [ "status", "running" ]} ]}'
)

for FILTER in "${FILTERS[@]}"
do
    echo "using search predicate filter: $FILTER" >&2

    OFFSET=0
    while true
    do
        echo "limit is $limit; page is $page, Offset is $OFFSET"
        declare -a FOO
        readarray -t FOO <<<"$(pfexec $VMAPI /vms?limit=$limit\&offset=$OFFSET\&predicate=$(urlencode "$FILTER")  | $JSON -Ha uuid state zone_state)"

        echo "Instances returned for page $page are: $(( ${#FOO[@]} - 1 ))"

        if [[ "${#FOO[@]}" -eq 1 ]]
        then
            echo "all done"
            break
        else

            let "page += 1"
            OFFSET=$(($page * $limit))
            echo "Pausing for $waiting_period seconds before requesting $limit more entries starting at $OFFSET to try to not blow up vmapi"
            sleep $waiting_period
        fi
    done
done

output:
[mhicks@headnode (us-east-1) ~/mhicks]$ ./filter_demo.sh 
using search predicate filter: {and: [{ "eq": ["fs_allowed", "*hyprlofs*" ] }, { "eq": [ "status", "running" ]} ]}
limit is 1000; page is 0, Offset is 0
Instances returned for page 0 are: 0
all done
using search predicate filter: {and: [{ "eq": ["fs_allowed", "*hyprlofs*" ] }, { "eq": [ "status", "stopped" ]} ]}
limit is 1000; page is 0, Offset is 0
Instances returned for page 0 are: 0
all done
using search predicate filter: { "eq": ["fs_allowed", "*pcfs*" ] }
limit is 1000; page is 0, Offset is 0
Instances returned for page 0 are: 0
all done
using search predicate filter: {and [{ "eq": ["fs_allowed", "*" ]}, { "eq": [ "status", "running" ]} ]}
limit is 1000; page is 0, Offset is 0
Instances returned for page 0 are: 0
all done
using search predicate filter: {and [{ "eq": ["fs_allowed", "*" ]}, { "eq": [ "status", "*" ]} ]}
limit is 1000; page is 0, Offset is 0
Instances returned for page 0 are: 0
all done
using search predicate filter: {and [{ "eq": ["fs_allowed", "ufs"]}, { "eq": [ "status", "running" ]} ]}
limit is 1000; page is 0, Offset is 0
Instances returned for page 0 are: 0
all done
using search predicate filter: {and [{ "eq": ["fs_allowed", "pcfs"]}, { "eq": [ "status", "running" ]} ]}
limit is 1000; page is 0, Offset is 0
Instances returned for page 0 are: 0
all done
using search predicate filter: {and [{ "eq": ["fs_allowed", "tmpfs"]}, { "eq": [ "status", "running" ]} ]}
limit is 1000; page is 0, Offset is 0
Instances returned for page 0 are: 0
all done
using search predicate filter: {and [{ "eq": ["fs_allowed", "*ufs*"]}, { "eq": [ "status", "running" ]} ]}
limit is 1000; page is 0, Offset is 0
Instances returned for page 0 are: 0
all done

Here are the Zones I found by other means in east1 that have fs_allowed and are running. I think they should match at least some of the filters I have in the demo.
[mhicks@headnode (us-east-1) ~/mhicks]$ sdc-oneachnode -a 'grep fs-allowed /etc/zones/*' | grep ufs
/etc/zones/66cba283-c084-cf6d-925d-e375f8d723bc.xml:<zone name="66cba283-c084-cf6d-925d-e375f8d723bc" zonepath="/zones/66cba283-c084-cf6d-925d-e375f8d723bc" autoboot="true" brand="joyent" ip-type="exclusive" limitpriv="default" debugid="319" fs-allowed="ufs,pcfs,tmpfs">
/etc/zones/630e6a29-d214-49a9-9a9d-a320bdf85524.xml:<zone name="630e6a29-d214-49a9-9a9d-a320bdf85524" zonepath="/zones/630e6a29-d214-49a9-9a9d-a320bdf85524" autoboot="true" brand="joyent" ip-type="exclusive" limitpriv="default" debugid="1" fs-allowed="ufs,pcfs,tmpfs">
/etc/zones/860ca376-71b9-e9a1-bc66-9b11c5ed7827.xml:<zone name="860ca376-71b9-e9a1-bc66-9b11c5ed7827" zonepath="/zones/860ca376-71b9-e9a1-bc66-9b11c5ed7827" autoboot="true" brand="joyent" ip-type="exclusive" limitpriv="default" debugid="301" fs-allowed="ufs,pcfs,tmpfs">
/etc/zones/4b2247d5-19dd-ee15-c25c-ca1f5d3a4369.xml:<zone name="4b2247d5-19dd-ee15-c25c-ca1f5d3a4369" zonepath="/zones/4b2247d5-19dd-ee15-c25c-ca1f5d3a4369" autoboot="true" brand="joyent" ip-type="exclusive" limitpriv="default" debugid="2656" fs-allowed="ufs,pcfs,tmpfs">

[mhicks@headnode (us-east-1) ~/mhicks]$ for i in 66cba283-c084-cf6d-925d-e375f8d723bc 630e6a29-d214-49a9-9a9d-a320bdf85524 860ca376-71b9-e9a1-bc66-9b11c5ed7827 4b2247d5-19dd-ee15-c25c-ca1f5d3a4369; do sdc-vmapi /vms/$i | json -Ha uuid alais state fs_allowed; done
66cba283-c084-cf6d-925d-e375f8d723bc  running ufs,pcfs,tmpfs
630e6a29-d214-49a9-9a9d-a320bdf85524  running ufs,pcfs,tmpfs
860ca376-71b9-e9a1-bc66-9b11c5ed7827  running ufs,pcfs,tmpfs
4b2247d5-19dd-ee15-c25c-ca1f5d3a4369  running ufs,pcfs,tmpfs

Comment by Julien Gilli [X]
Created at 2016-12-14T02:03:08.000Z
Updated at 2016-12-14T02:04:36.000Z
There are two different problems with this issue:

1. The use of the status property which doesn't exist for VMAPI's VM objects. Every occurrence of status should be state.

2. The use of a search predicate (or search filter) that includes a non-indexed field. The problem is that the default page size for a paginated VMAPI request/response is 1000. Since there are more than 1000 VMs in east-1, here's how the following query is performed:

$ sdc-vmapi /vms?predicate=$(urlencode '{"and": [{ "eq": [ "state", "running" ]}, { "eq": ["fs_allowed", "*ufs*" ] } ]}')

A SQL query is generated with a WHERE clause that includes only the indexed field, that is state and not fs_allowed. This SQL query uses a LIMIT of 1000. Once the database returned 1000 records, another filtering process that includes the non-indexed fields (in this case state and fs_allowed) is performed, but only on those 1000 records, not on all VM objects.

As a result, unless some of the first 1000 records returned by the SQL query that includes only indexed fields return an object that can match the complete search predicate, no object is returned in the response.

In this specific case, the solution is to constrain the set of records returned by the SQL query using another constraining indexed property such as owner_uuid so that it's kept under 1000 entries:

sdc-vmapi /vms?predicate=$(urlencode '{"and": [{"eq": [ "owner_uuid", "7b315468-c6be-46dc-b99b-9c1f59224693"]}, { "eq": [ "state", "running" ]}, { "eq": ["fs_allowed", "*ufs*" ] } ]}') | json -H length
3
[jgilli@headnode (us-east-1) ~]$

I believe there's an existing MORAY ticket to track this problem. It could be MORAY-104, but I'm not 100% certain.

@dap: can you confirm which ticket this one should be mark as a duplicate of?

Comment by Michael Hicks
Created at 2016-12-14T06:47:52.000Z
aargh, yeah status!=state

Also one other minor issue, I misspelled alias as alais in my vmapi output pipe to json resulting in the aliases not being printed.

So in an attempt at refining my process with your feedback I found another issue where I was writing the predicates in invalid json forgetting to quote the beginning 'and' and to put a colon after it.

Still having worked through that issue, using the instance I know has fs_allowed, I can grab that instance's owner_uuid
sdc-vmapi  /vms/4b2247d5-19dd-ee15-c25c-ca1f5d3a4369 | json -Ha alias uuid state fs_allowed owner_uuid
maxbuild 4b2247d5-19dd-ee15-c25c-ca1f5d3a4369 running ufs,pcfs,tmpfs 22378d73-1c52-4257-8dbf-92aead70ab38
I can then see via normal parameterized filtering on vmapi calls that there are significantly less than 1000 instances with that owner_uuid in state running
sdc-vmapi  /vms?owner_uuid=22378d73-1c52-4257-8dbf-92aead70ab38\&state=running | json -Ha alias uuid state | wc -l
      35
sdc-vmapi  /vms?owner_uuid=22378d73-1c52-4257-8dbf-92aead70ab38\&state=running | json -Ha alias uuid state fs_allowed | grep -c pcfs
1
so there are 35 instances, 1 of which at least has fs_allowed set to ufs,pcfs,tmpfs.

Filtering on that uuid and state running with predicates I see all 35 entries returned
[mhicks@headnode (us-east-1) ~/mhicks]$ ./filter_demo.sh 
using search predicate filter: {"and": [{"eq": [ "owner_uuid", "22378d73-1c52-4257-8dbf-92aead70ab38"]},{ "eq": [ "state", "running" ]}]}
limit is 1000; page is 0, Offset is 0
Instances returned for page 0 are: 35
Pausing for 10 seconds before requesting 1000 more entries starting at 1000 to try to not blow up vmapi
limit is 1000; page is 1, Offset is 1000
Instances returned for page 1 are: 0
all done

and adding to that same filter the fs_allowed with wildcard matches I see the results I expect
[mhicks@headnode (us-east-1) ~/mhicks]$ ./filter_demo.sh 
using search predicate filter: {"and": [{"eq": [ "owner_uuid", "22378d73-1c52-4257-8dbf-92aead70ab38"]},{ "eq": [ "state", "running" ]},{ "eq": ["fs_allowed", "*pcfs*"]}]}
limit is 1000; page is 0, Offset is 0
Instances count returned for page 0 are: 1
Instances returned for page 0 are: 4b2247d5-19dd-ee15-c25c-ca1f5d3a4369 running running
all done
using search predicate filter: {"and": [{"eq": [ "owner_uuid", "22378d73-1c52-4257-8dbf-92aead70ab38"]},{ "eq": [ "state", "running" ]},{ "eq": ["fs_allowed", "*ufs*"]}]}
limit is 1000; page is 0, Offset is 0
Instances count returned for page 0 are: 1
Instances returned for page 0 are: 4b2247d5-19dd-ee15-c25c-ca1f5d3a4369 running running
all done
using search predicate filter: {"and": [{"eq": [ "owner_uuid", "22378d73-1c52-4257-8dbf-92aead70ab38"]},{ "eq": [ "state", "running" ]},{ "eq": ["fs_allowed", "*tmpfs*"]}]}
limit is 1000; page is 0, Offset is 0
Instances count returned for page 0 are: 1
Instances returned for page 0 are: 4b2247d5-19dd-ee15-c25c-ca1f5d3a4369 running running
all done

So when I repeat that for another owner_uuid that has over 1000 instances for one of the original instances I found with grep I see the bug you are talking about manifested. The first filter doesn't specify state:running and so it gets 1000 instances which are mostly destroyed, and finds that none of those have fs_allowed set and returns no instances whereas if I filter on running instances, it cuts it down to below 1000 first, and then finds the 3 instances that have fs_allowed set
using search predicate filter: {"and": [{"eq": [ "owner_uuid", "7b315468-c6be-46dc-b99b-9c1f59224693"]},{ "eq": ["fs_allowed", "*tmpfs*"]}]}
limit is 1000; page is 0, Offset is 0
Instances count returned for page 0 are: 1
Instances returned for page 0 are: 
all done
using search predicate filter: {"and": [{"eq": [ "owner_uuid", "7b315468-c6be-46dc-b99b-9c1f59224693"]},{ "eq": [ "state", "running" ]},{ "eq": ["fs_allowed", "*tmpfs*"]}]}
limit is 1000; page is 0, Offset is 0
Instances count returned for page 0 are: 3
Instances returned for page 0 are: 860ca376-71b9-e9a1-bc66-9b11c5ed7827 running running 66cba283-c084-cf6d-925d-e375f8d723bc running running 630e6a29-d214-49a9-9a9d-a320bdf85524 running running
Pausing for 10 seconds before requesting 1000 more entries starting at 1000 to try to not blow up vmapi
limit is 1000; page is 1, Offset is 1000
Instances count returned for page 1 are: 1
Instances returned for page 1 are: 
all done

Comment by David Pacheco
Created at 2016-12-14T18:28:55.000Z
I think MORAY-104 makes sense as the root cause of the issue related to pagination with queries on unindexed fields.