Calculating the distribution of attributes from academic entities

Another feature of the academic API is the ability to calculate the distribution of attribute values for a set of paper entities. This can be done by calling the calchistogram API endpoint.

This is a GET request, so we start by creating a query string, as follows:

    string queryString = $"expr={QueryExpression}&attributes=Y,F.FN";

    //queryString += "&model=latest";
    //queryString += "&count=10";
    //queryString += "&offset=0";

The parameters we can specify are the same as with Evaluate, except that we do not have the orderby parameter. For this call, we want to get the year of publication (Y) and the name of the field of study (F.FN).
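Because the query expression can contain characters such as `=`, `>`, quotes, and spaces, it is usually safer to percent-encode each parameter value before sending the request. The following is a minimal sketch of that idea, not the code used in this example: the `HistogramQuery` class and its `Build` method are hypothetical helpers, while `Uri.EscapeDataString` is the standard .NET method for encoding a query string value.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper (not part of the original example): builds the
// calchistogram query string with percent-encoded parameter values.
public static class HistogramQuery
{
    public static string Build(string expression, string attributes,
        string model = null, int? count = null, int? offset = null)
    {
        var parts = new List<string>
        {
            "expr=" + Uri.EscapeDataString(expression),
            "attributes=" + Uri.EscapeDataString(attributes)
        };

        // Optional parameters are only appended when a value is supplied
        if (model != null)
            parts.Add("model=" + Uri.EscapeDataString(model));
        if (count.HasValue)
            parts.Add("count=" + count.Value.ToString(CultureInfo.InvariantCulture));
        if (offset.HasValue)
            parts.Add("offset=" + offset.Value.ToString(CultureInfo.InvariantCulture));

        return string.Join("&", parts);
    }
}
```

With this helper, a call such as `HistogramQuery.Build(QueryExpression, "Y,F.FN", count: 10)` would produce a query string that can be appended after `calchistogram?`.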

We make the call to the API without a request body, as shown in the following code:

    HistogramResponse response = await _webRequest.MakeRequest<object, HistogramResponse>(
        HttpMethod.Get, $"calchistogram?{queryString}");

    // Nothing to display if the call failed or returned no histograms
    if (response == null || response.histograms == null || response.histograms.Length == 0)
        return;

If the call succeeds, we expect a HistogramResponse object in return. This is a data contract, which should contain the data from the JSON response.
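The data contract classes are not listed in this section. As a rough sketch, they might look like the following. The class names `HistogramResponse`, `Histogram`, and `HistogramY` and the member names come from the code and JSON in this section; the use of `System.Text.Json`, the `JsonElement` type for `value`, and the `HistogramParser` helper are assumptions made for illustration.

```csharp
using System.Text.Json;

// One entry of a histogram: a value and how often it occurs
public class HistogramY
{
    // Assumption: JsonElement is used because the value may be a
    // number (a year) or a string (a field of study)
    public JsonElement value { get; set; }
    public double prob { get; set; }
    public int count { get; set; }
}

// The histogram for a single requested attribute
public class Histogram
{
    public string attribute { get; set; }
    public int distinct_values { get; set; }
    public int total_count { get; set; }
    public HistogramY[] histogram { get; set; }
}

// Top-level response; property names mirror the JSON field names,
// since System.Text.Json matches names case-sensitively by default
public class HistogramResponse
{
    public string expr { get; set; }
    public int num_entities { get; set; }
    public Histogram[] histograms { get; set; }
}

// Hypothetical helper that turns the raw JSON text into the contract
public static class HistogramParser
{
    public static HistogramResponse Parse(string json) =>
        JsonSerializer.Deserialize<HistogramResponse>(json);
}
```

Keeping the property names identical to the JSON fields avoids the need for mapping attributes, at the cost of not following the usual C# naming conventions.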

A successful request should give the following JSON response (depending on the requested attributes):

    {
        "expr": "And(Composite(AA.AuN=='jaime teevan'),Y>2012)",
        "num_entities": 37,
        "histograms": [
        {
            "attribute": "Y",
            "distinct_values": 3,
            "total_count": 37,
            "histogram": [
            {
                "value": 2014,
                "prob": 1.275e-07,
                "count": 15
            },
            {
                "value": 2013,
                "prob": 1.184e-07,
                "count": 12
            },
            {
                "value": 2015,
                "prob": 8.279e-08,
                "count": 10
            }]
        },
        {
            "attribute": "F.FN",
            "distinct_values": 34,
            "total_count": 53,
            "histogram": [
            {
                "value": "crowdsourcing",
                "prob": 7.218e-08,
                "count": 9
            },
            {
                "value": "information retrieval",
                "prob": 4.082e-08,
                "count": 4
            },
            {
                "value": "personalization",
                "prob": 2.384e-08,
                "count": 3
            },
            {
                "value": "mobile search",
                "prob": 2.119e-08,
                "count": 2
            }]
        }]
    }

The response contains the original query expression that we used. It will give us a count of the number of matching entities. An array of histograms will also be present. This will contain an item for each of the attributes we requested. The data for each item is described in the following table:

| Data field | Description |
| --- | --- |
| attribute | This is the attribute name. |
| distinct_values | This is the number of distinct values found among the matching entities for this attribute. |
| total_count | This is the total number of value instances among the matching entities for this attribute. |
| histogram | This is an array containing the histogram data for this attribute. |
| histogram[x].value | This is the value for the current histogram entry. |
| histogram[x].prob | This is the probability that matching entities have this attribute value. |
| histogram[x].count | This is the number of matching entities that have this value. |
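The `count` and `total_count` fields can be combined to express each value as a share of all value instances for an attribute. This is a derived figure, separate from the `prob` field returned by the API. The sketch below shows the arithmetic; the `HistogramMath` class and `SharePercent` method are hypothetical names introduced for illustration.

```csharp
using System;

// Hypothetical helper: expresses one histogram entry as a percentage
// of the attribute's total_count
public static class HistogramMath
{
    public static double SharePercent(int count, int totalCount)
    {
        if (totalCount <= 0)
            throw new ArgumentOutOfRangeException(nameof(totalCount));

        // Multiply by 100.0 first so the division is done in floating point
        return 100.0 * count / totalCount;
    }
}
```

For the year histogram in the sample response, `SharePercent(15, 37)` is roughly 40.5, so about 40.5 percent of the matching papers were published in 2014. Note that when the response lists fewer entries than `distinct_values`, as with `F.FN` in the sample (4 entries listed, 34 distinct values), the shares of the listed entries will not add up to 100.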

With a successful response, we loop through the data, presenting it in the UI using the following code:

    StringBuilder sb = new StringBuilder();

    sb.AppendFormat("Total number of matching entities: {0}\n", response.num_entities);

    foreach (Histogram histogram in response.histograms)
    {
        sb.AppendFormat("Attribute: {0}\n", histogram.attribute);

        // One line per value, showing how many times it was found
        foreach (HistogramY histogramY in histogram.histogram)
        {
            sb.AppendFormat("\tValue '{0}' was found {1} times\n",
                histogramY.value, histogramY.count);
        }

        sb.Append("\n");
    }

    Results = sb.ToString();

A successful call gives us the following result:

[Screenshot: Calculating the distribution of attributes from academic entities]

An unsuccessful API call returns an error containing a response code. The possible response codes are the same as those described in the previous section on the Interpret feature.
