The Statistics API allows rich querying of Triathlon.org results. Whilst it is possible to replicate these queries by combining other API calls, the statistics endpoints provide quick access to a data store optimized for such analytics.
For example, the following API call returns a medal breakdown for Javier Gomez (athlete.id 5695) for the current 12 months. With advanced filtering and grouping available for a multitude of parameters, the statistics API can generate a wide range of analysis types. See the 'Getting Started' section for further inspiration.
curl --header "apikey: [[app:key]]" "https://api.triathlon.org/v1/statistics/results?analysis=count&group_by=position|event.name&timeframe=this_12_months&filters=athlete.id%2Ceq%2C5695|position%2Clte%2C3"
Data types
To facilitate the analysis types listed below, such as average and minimum, all times are stored in seconds. Convert them on output if you wish to display a different format.
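Since times come back as seconds, a small formatter is usually all that is needed for display. The following is a minimal sketch; the function name format_seconds is hypothetical and not part of the API.

```python
def format_seconds(total_seconds):
    """Format a duration given in whole seconds as H:MM:SS."""
    hours, remainder = divmod(int(total_seconds), 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


print(format_seconds(6323))  # 1:45:23
```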
Available Results
Currently only World Triathlon Series and Olympic Games results are available for analysis. All results will be added shortly.
Analysis Types
- count - counts the number of items recorded that meet the criteria that you provide
- count_unique - counts the number of items that have a unique value for a given property
- minimum - finds the minimum numeric value for a given property. All non-numeric values are ignored as part of the analysis
- maximum - finds the maximum numeric value for a given property. All non-numeric values are ignored as part of the analysis
- sum - finds the sum of all numeric values for a given property. All non-numeric values are ignored as part of the analysis
- average - finds the average value for a given property. All non-numeric values are ignored as part of the analysis
- median - finds the median value for a given property. All non-numeric values are ignored as part of the analysis
- percentile - finds the specified percentile value for a given property. All non-numeric values are ignored as part of the analysis
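To illustrate what "non-numeric values are ignored" means for the numeric analysis types, here is a purely local sketch. It does not call the API, the sample values are hypothetical, and how the API itself stores entries such as a DNF is an assumption made only for this illustration.

```python
import statistics


def numeric_only(values):
    """Keep ints and floats; drop strings, None and booleans."""
    return [
        v for v in values
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]


# Hypothetical swim times in seconds, with two non-numeric entries.
swim_times = [1052, 1048, "DNF", None, 1100]
clean = numeric_only(swim_times)

summary = {
    "minimum": min(clean),
    "maximum": max(clean),
    "sum": sum(clean),
    "average": statistics.mean(clean),
    "median": statistics.median(clean),
}
print(summary)
```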
Filters
Filters restrict the records that a query considers. All filters except geo filters (see below) contain the following properties. The in filter accepts optional additional values (see below).
- property_name - see event properties of the chosen endpoint for all available properties and types
- operator - which filter to perform (see list below for available options)
- property_value - the value of the property filter
Here is a list of operators you can use with filters:
- eq - Equal to e.g. filters=athlete.last,eq,Jorgensen
- ne - Not equal to e.g. filters=athlete.last,ne,Brownlee
- lt - Less than e.g. filters=position,lt,10
- lte - Less than or equal to e.g. filters=position,lte,10
- gt - Greater than e.g. filters=position,gt,3
- gte - Greater than or equal to e.g. filters=position,gte,3
- exists - Whether or not a specific property exists on an event record. When using the “exists” operator, the value passed in must be either “true” or “false” e.g. filters=swim.time,exists,false
- in - Whether or not the property value is in a given set of values e.g. filters=athlete.last,in,Jorgensen,Brownlee,Mola
- contains - Whether or not the string property value contains the given set of characters e.g. filters=event.name,contains,Hamburg
- within - Used to select events within a certain radius of the provided geo coordinate (for geo analysis only) e.g. filters=location,within,30,37.77479,-122.42005
Use id properties where available
Athletes may change names whereas ids are fixed. Favour ids over name properties where possible.
Because not all filter operators make sense for the different property data types, only certain ones are valid for each type.
- string - eq, ne, lt, gt, exists, in, contains, not_contains
- number - eq, ne, lt, lte, gt, gte, exists, in
- boolean - eq, exists, in
- geo coordinates - within
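The type-to-operator table above can be checked on the client before a request is sent. This is a minimal sketch that copies the lists exactly as documented; the names VALID_OPERATORS and is_valid_operator, and the "geo" key, are hypothetical.

```python
# Operators permitted for each property type, as listed above.
VALID_OPERATORS = {
    "string": {"eq", "ne", "lt", "gt", "exists", "in", "contains", "not_contains"},
    "number": {"eq", "ne", "lt", "lte", "gt", "gte", "exists", "in"},
    "boolean": {"eq", "exists", "in"},
    "geo": {"within"},
}


def is_valid_operator(property_type, operator):
    """Return True if the operator is listed for the given property type."""
    return operator in VALID_OPERATORS.get(property_type, set())


print(is_valid_operator("number", "lte"))      # True
print(is_valid_operator("boolean", "contains"))  # False
```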
Filters are constructed as a comma-separated string of the property_name, operator and property_value. Multiple filters may be applied, separated by a pipe. To avoid issues when sending queries, it is recommended that you URL-encode your filter string before making the request.
Case Sensitive
Filters are case sensitive so london is not the same as London. Beware!
For example, the following filter matches an athlete.id of 11378 where the event name contains the string "London". The URL-encoded filter to pass is shown first, followed by the raw unencoded filter for clarity.
filters=athlete.id%2Ceq%2C11378%7Cevent.name%2Ccontains%2CLondon
filters=athlete.id,eq,11378|event.name,contains,London
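The construction rules can be sketched in a few lines of Python using the standard library's urllib.parse.quote. The helper name build_filters is hypothetical; the sample values are the athlete.id and position filters from the medal breakdown query at the top of this page.

```python
from urllib.parse import quote


def build_filters(filters):
    """Join filter parts with commas and filters with pipes.

    Returns both the raw string and its URL-encoded form.
    """
    raw = "|".join(",".join(str(part) for part in f) for f in filters)
    return raw, quote(raw, safe="")


raw, encoded = build_filters([
    ("athlete.id", "eq", 5695),
    ("position", "lte", 3),
])
print(raw)      # athlete.id,eq,5695|position,lte,3
print(encoded)  # athlete.id%2Ceq%2C5695%7Cposition%2Clte%2C3
```

Passing safe="" makes quote encode every reserved character, including the comma and the pipe.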
Geo filters allow you to restrict your criteria within a certain radius (in miles). These filters contain five properties:
- property_name - location
- operator - within
- distance - the distance in miles to find events
- latitude - the latitude of the starting point
- longitude - the longitude of the starting point
The following example limits results to within 30 miles of the coordinates 37.77479, -122.42005 (San Francisco, USA).
filters=location%2Cwithin%2C30%2C37.77479%2C-122.42005
filters=location,within,30,37.77479,-122.42005
There is one limitation to geo filtering: it can’t be used in combination with a group_by request.
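A geo filter always has the same five parts, so it can be assembled by a dedicated helper. This is a sketch only: build_geo_filter is a hypothetical name, and the range checks on latitude and longitude are a client-side assumption rather than documented API behaviour.

```python
from urllib.parse import quote


def build_geo_filter(distance_miles, latitude, longitude):
    """Build a 'within' geo filter; returns (raw, url_encoded)."""
    # Assumed sanity checks; the API's own validation is not documented here.
    if not -90 <= latitude <= 90:
        raise ValueError("latitude must be between -90 and 90")
    if not -180 <= longitude <= 180:
        raise ValueError("longitude must be between -180 and 180")
    raw = f"location,within,{distance_miles},{latitude},{longitude}"
    return raw, quote(raw, safe="")


raw, encoded = build_geo_filter(30, 37.77479, -122.42005)
print(raw)      # location,within,30,37.77479,-122.42005
print(encoded)  # location%2Cwithin%2C30%2C37.77479%2C-122.42005
```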
Filtering by non-existent properties
If you apply a filter that filters on non-existent properties, the filter will simply have no effect as opposed to causing an error. If you get an unexpected result double check your filters!
The in filter accepts multiple property values, each separated by an additional comma. The following filter matches records where the athlete last name is Brownlee, Jorgensen or Snowsill. You may pass in as many values as you wish.
filters=athlete.last,in,Brownlee,Jorgensen,Snowsill
Group By
The group_by query parameter may be added to group the results by a specific property. To use it, set the group_by parameter to the name of the property by which you want to group. To group by multiple properties, provide a pipe-delimited list.
group_by=event.name|athlete.id
Grouping by non-existent properties
If you group_by a non-existent property, the group_by clause will return null for the grouped property rather than failing.
Timeframe
Timeframes are used to restrict the timeframe of the data under analysis and may be specified in two different ways:
Relative Timeframes - a timeframe that is relative to now. (For example: this year.)
Absolute Timeframes - a timeframe that is specified by two points in time provided by the query parameters start_date and end_date
Relative vs Absolute Timeframes
Absolute timeframes are recommended to avoid errors. When both relative and absolute timeframes are provided the absolute timeframe takes precedence.
Relative timeframes may be grouped into two categories: “this” and “previous”. Use “this” when you want to include events happening right up until now. Use “previous” when you only want to get results for complete chunks of time (e.g. the full month or year).
- this_month
- this_year
- this_n_days
- this_n_weeks
- this_n_months
- this_n_years
- previous_month (same as previous_1_month)
- previous_n_days
- previous_n_weeks
- previous_n_months
- previous_n_years
Relative timeframes
Care must be taken when using relative timeframes; when in doubt, use absolute timeframes for your analysis. For example, if today is June 6th 2015 then this_month would run from June 1 to June 6 2015 inclusive, and previous_month would run from May 1 to May 31 2015.
Relative timeframes are useful for quick data extraction, e.g. for all results so far in the current calendar year simply provide this_year, which runs from Jan 1 of the current year up to now.
Absolute timeframes simply require the start_date and end_date parameters to be supplied in yyyy-mm-dd format.
timeframe=this_year
start_date=2015-01-01&end_date=2015-12-31
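Because absolute timeframes are the recommended option, it can help to resolve a relative period into explicit dates on the client. The sketch below reproduces the June 6th 2015 example above; the function names are hypothetical, and the boundaries follow the description in this section rather than any server-side guarantee.

```python
from datetime import date, timedelta


def this_month(today):
    """First day of the current month up to and including today."""
    return today.replace(day=1), today


def previous_month(today):
    """First and last day of the full month before today."""
    last_day = today.replace(day=1) - timedelta(days=1)
    return last_day.replace(day=1), last_day


def as_query(start, end):
    """Render the start_date and end_date parameters in yyyy-mm-dd format."""
    return f"start_date={start.isoformat()}&end_date={end.isoformat()}"


today = date(2015, 6, 6)
print(as_query(*this_month(today)))      # start_date=2015-06-01&end_date=2015-06-06
print(as_query(*previous_month(today)))  # start_date=2015-05-01&end_date=2015-05-31
```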
Interval
The interval property specifies how to break down timeframes into intervals, e.g. to find results per year or per quadrennial Olympic cycle. Supported intervals include:
- monthly - breaks your timeframe into month-long chunks.
- yearly - breaks your timeframe into year-long chunks.
- every_n_times - breaks your timeframe into chunks according to the specified length e.g. every_4_years
For example, to retrieve Alistair Brownlee's podium distribution for each year of the World Triathlon Series, the following unencoded query would achieve this (note the interval=yearly parameter).
curl --header "apikey: [[app:key]]" "https://api.triathlon.org/v1/statistics/results?group_by=position&filters=athlete.id,eq,7788|position,lte,3&timeframe=this_7_years&interval=yearly"
Intervals require a timeframe to be set
You must set a timeframe to use intervals. The API will return a 400 Bad Request response if it is not set.
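The pieces described on this page can be combined into a single request URL, with the interval rule checked before anything is sent. This is a minimal sketch: build_statistics_url is a hypothetical helper, and treating start_date plus end_date as satisfying the timeframe requirement is an assumption. The request itself still needs the apikey header shown in the curl examples.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.triathlon.org/v1/statistics/results"


def build_statistics_url(analysis=None, group_by=(), filters=(),
                         timeframe=None, start_date=None, end_date=None,
                         interval=None):
    """Assemble a statistics query URL with URL-encoded parameters."""
    has_timeframe = bool(timeframe) or bool(start_date and end_date)
    if interval and not has_timeframe:
        # Mirrors the documented 400 response, but fails before the request.
        raise ValueError("interval requires a timeframe")

    params = {}
    if analysis:
        params["analysis"] = analysis
    if group_by:
        params["group_by"] = "|".join(group_by)
    if filters:
        params["filters"] = "|".join(
            ",".join(str(part) for part in f) for f in filters
        )
    if timeframe:
        params["timeframe"] = timeframe
    if start_date and end_date:
        params["start_date"] = start_date
        params["end_date"] = end_date
    if interval:
        params["interval"] = interval
    return f"{BASE_URL}?{urlencode(params)}"


url = build_statistics_url(
    group_by=["position"],
    filters=[("athlete.id", "eq", 7788), ("position", "lte", 3)],
    timeframe="this_7_years",
    interval="yearly",
)
print(url)
```

Raising early on a missing timeframe saves a round trip and keeps the failure close to the code that caused it.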