Statistics API Overview

The Statistics API allows rich querying of Triathlon.org results. Whilst it is possible to replicate these queries by combining other API calls, the statistics endpoints provide quick access to a data store optimized for such analytics.


Use the API to find how often these athletes have shared the podium

With advanced filtering and grouping available across a multitude of parameters, the Statistics API can generate a wide range of analyses. See the 'Getting Started' section for further inspiration. For example, the following API call returns a medal breakdown for Javier Gomez (athlete.id 5695) over the current 12 months, grouped by position and event name.

curl --header "apikey: [[app:key]]" "https://api.triathlon.org/v1/statistics/results?analysis=count&group_by=position|event.name&timeframe=this_12_months&filters=athlete.id%2Ceq%2C5695|position%2Clte%2C3"

📘

Data types

To support the numeric analysis types listed below, such as average and minimum, all times are stored in seconds. Convert them on output if you wish to display a different format.
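As an illustration of converting on output, here is a minimal sketch that formats a value in seconds as H:MM:SS. The function name format_seconds and the sample value 6312 are hypothetical and not part of the API.

```python
def format_seconds(total_seconds: float) -> str:
    """Format a duration in seconds as H:MM:SS (rounded to whole seconds)."""
    whole = int(round(total_seconds))
    hours, remainder = divmod(whole, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


# 6312 is a made-up sample value, not a real result from the API.
print(format_seconds(6312))
```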

Available Results

Currently only World Triathlon Series and Olympic Games results are available for analysis. Results from all other events will be added shortly.

Analysis Types

  • count - counts the number of items recorded that meet the criteria that you provide
  • count_unique - counts the number of items that have a unique value for a given property
  • minimum - finds the minimum numeric value for a given property. All non-numeric values are ignored as part of the analysis
  • maximum - finds the maximum numeric value for a given property. All non-numeric values are ignored as part of the analysis
  • sum - finds the sum of all numeric values for a given property. All non-numeric values are ignored as part of the analysis
  • average - finds the average value for a given property. All non-numeric values are ignored as part of the analysis
  • median - finds the median value for a given property. All non-numeric values are ignored as part of the analysis
  • percentile - finds the specified percentile value for a given property. All non-numeric values are ignored as part of the analysis

Filters

Filters are used to restrict queries. All filters except geo filters (see below) contain the following three properties. The in filter accepts optional additional values (see below).

  • property_name - see event properties of the chosen endpoint for all available properties and types
  • operator - which filter to perform (see list below for available options)
  • property_value - the value of the property filter

Here is a list of operators you can use with filters:

  • eq - Equal to e.g. filters=athlete.last,eq,Jorgensen
  • ne - Not equal to e.g. filters=athlete.last,ne,Brownlee
  • lt - Less than e.g. filters=position,lt,10
  • lte - Less than or equal to e.g. filters=position,lte,10
  • gt - Greater than e.g. filters=position,gt,3
  • gte - Greater than or equal to e.g. filters=position,gte,3
  • exists - Whether or not a specific property exists on an event record. When using the “exists” operator, the value passed in must be either “true” or “false” e.g. filters=swim.time,exists,false
  • in - Whether or not the property value is in a given set of values e.g. filters=athlete.last,in,Jorgensen,Brownlee,Mola
  • contains - Whether or not the string property value contains the given set of characters e.g. filters=event.name,contains,Hamburg
  • within - Used to select events within a certain radius of the provided geo coordinate (for geo analysis only) e.g. filters=location,within,30,37.77479,-122.42005

📘

Use id properties where available

Athletes may change names whereas ids are fixed. Favour the use of ids over name properties where possible.

Because not all filter operators make sense for the different property data types, only certain ones are valid for each type.

  • string - eq, ne, lt, gt, exists, in, contains, not_contains
  • number - eq, ne, lt, lte, gt, gte, exists, in
  • boolean - eq, exists, in
  • geo coordinates - within
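The type-to-operator table above can be mirrored client-side to catch invalid combinations before a request is sent. This is only a sketch: the names VALID_OPERATORS and is_valid_operator, and the short key "geo" for geo coordinates, are assumptions made for illustration.

```python
# Operators copied from the list above; the "geo" key is shorthand
# chosen here for the "geo coordinates" type.
VALID_OPERATORS = {
    "string": {"eq", "ne", "lt", "gt", "exists", "in", "contains", "not_contains"},
    "number": {"eq", "ne", "lt", "lte", "gt", "gte", "exists", "in"},
    "boolean": {"eq", "exists", "in"},
    "geo": {"within"},
}


def is_valid_operator(property_type: str, operator: str) -> bool:
    """Return True if the operator is listed for the given property type."""
    return operator in VALID_OPERATORS.get(property_type, set())


print(is_valid_operator("number", "lte"))
print(is_valid_operator("boolean", "lt"))
```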

Filters are constructed by generating a comma-separated string of the property_name, operator and property_value. Multiple filters may be applied, separated by a pipe. To avoid issues when sending queries, it is recommended that you URL-encode your filter string before making your request.

🚧

Case Sensitive

Filters are case sensitive so london is not the same as London. Beware!

For example, the following filter matches results where athlete.id is 11378 and the event name contains the string "London". The URL-encoded filter to pass is shown first, followed by the raw unencoded filter for clarity.

filters=athlete.id%2Ceq%2C11378%7Cevent.name%2Ccontains%2CLondon
filters=athlete.id,eq,11378|event.name,contains,London
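The construction rules above (commas within a filter, pipes between filters, then URL-encoding) can be sketched with Python's standard library. The helper name build_filters is hypothetical; the filter values are taken from the medal breakdown example at the top of this page.

```python
from urllib.parse import quote


def build_filters(filters):
    """Join filter tuples into the raw and URL-encoded filter strings.

    Each filter is a tuple of (property_name, operator, value, ...); extra
    values are appended with commas, which suits the in operator.
    """
    raw = "|".join(",".join(str(part) for part in f) for f in filters)
    # safe="" ensures commas and pipes are percent-encoded as well.
    return raw, quote(raw, safe="")


raw, encoded = build_filters([("athlete.id", "eq", 5695), ("position", "lte", 3)])
print("filters=" + raw)
print("filters=" + encoded)
```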

Geo filters allow you to restrict results to events within a certain radius (in miles) of a given point. These filters contain five properties:

  • property_name - location
  • operator - within
  • distance - the distance in miles to find events
  • latitude - the latitude of the starting point
  • longitude - the longitude of the starting point

The following example limits results to 30 miles from San Francisco, USA (latitude 37.77479, longitude -122.42005).

filters=location%2Cwithin%2C30%2C37.77479%2C-122.42005
filters=location,within,30,37.77479,-122.42005
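A geo filter can be assembled from its five properties as sketched below. The function name geo_filter and the range checks are assumptions added for illustration; the API's own validation behaviour is not documented here.

```python
from urllib.parse import quote


def geo_filter(distance_miles, latitude, longitude):
    """Build the raw geo filter string: location,within,distance,lat,lon."""
    # These checks are client-side sanity checks assumed for this sketch.
    if distance_miles <= 0:
        raise ValueError("distance must be positive")
    if not -90 <= latitude <= 90:
        raise ValueError("latitude must be between -90 and 90")
    if not -180 <= longitude <= 180:
        raise ValueError("longitude must be between -180 and 180")
    return f"location,within,{distance_miles},{latitude},{longitude}"


raw = geo_filter(30, 37.77479, -122.42005)
print("filters=" + quote(raw, safe=""))
print("filters=" + raw)
```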

There is one limitation to geo filtering, which is that it can’t be used in combination with a group by request.

📘

Filtering by non-existent properties

If you apply a filter that filters on non-existent properties, the filter will simply have no effect as opposed to causing an error. If you get an unexpected result double check your filters!

The in filter accepts multiple property values, each separated by an additional comma. The following filter therefore matches results where the athlete's last name is Brownlee, Jorgensen or Snowsill. You may pass in as many values as you wish.

filters=athlete.last,in,Brownlee,Jorgensen,Snowsill

Group By

The group_by query parameter may be added to group the results by a specific property. To use it, set the group_by parameter equal to the name of the property by which you want to group. To group by multiple properties, provide a pipe-delimited list.

group_by=event.name|athlete.id

📘

Grouping by non-existent properties

If you group_by a non-existent property, the group_by clause will return null for the grouped property rather than failing.

Timeframe

Timeframes are used to restrict the timeframe of the data under analysis and may be specified in two different ways:

  • Relative Timeframes - a timeframe that is relative to now (for example, this year), provided by the query parameter timeframe
  • Absolute Timeframes - a timeframe that is specified by two points in time provided by the query parameters start_date and end_date

📘

Relative vs Absolute Timeframes

Absolute timeframes are recommended to avoid errors. When both relative and absolute timeframes are provided the absolute timeframe takes precedence.

Relative timeframes may be grouped into two categories: “this” and “previous”. Use “this” when you want to include events happening right up until now. Use “previous” when you only want to get results for complete chunks of time (e.g. the full month or year).

  • this_month
  • this_year
  • this_n_days
  • this_n_weeks
  • this_n_months
  • this_n_years
  • previous_month (same as previous_1_month)
  • previous_n_days
  • previous_n_weeks
  • previous_n_months
  • previous_n_years

🚧

Relative timeframes

Care must be taken when using relative timeframes; when in doubt, use absolute timeframes for your analysis. For example, if today is June 6 2015 then this_month would run from June 1 to June 6 2015 inclusive and previous_month would run from May 1 to May 31 2015.

Relative timeframes are useful for quick data extraction, e.g. for all results in the calendar year of 2015 (when 2015 is the current year) simply provide this_year, which runs Jan 1 - Dec 31 of the current year.

Absolute timeframes simply require the start_date and end_date parameters to be supplied in yyyy-mm-dd format.

timeframe=this_year
start_date=2015-01-01&end_date=2015-12-31
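To make the difference between "this" and "previous" concrete, the sketch below converts this_month and previous_month into absolute start_date and end_date values, following the June 6 2015 example above. The function names are hypothetical, and how the server itself resolves boundaries (for instance time zones) is not specified here.

```python
from datetime import date, timedelta


def this_month(today: date):
    """First day of the current month up to and including today."""
    return today.replace(day=1), today


def previous_month(today: date):
    """First and last day of the last complete calendar month."""
    last_day = today.replace(day=1) - timedelta(days=1)
    return last_day.replace(day=1), last_day


def as_query(start: date, end: date) -> str:
    """Render the dates as yyyy-mm-dd query parameters."""
    return f"start_date={start.isoformat()}&end_date={end.isoformat()}"


today = date(2015, 6, 6)
print(as_query(*this_month(today)))
print(as_query(*previous_month(today)))
```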

Interval

The interval property specifies how to break down a timeframe into intervals, e.g. to find results per year or per quadrennial Olympic cycle. Supported intervals include:

  • monthly - breaks your timeframe into month-long chunks.
  • yearly - breaks your timeframe into year-long chunks.
  • every_n_times - breaks your timeframe into chunks according to the specified length e.g. every_4_years

For example, if you wished to retrieve Alistair Brownlee's podium distribution for each year of the World Triathlon Series, the following unencoded query would achieve this (note the interval=yearly parameter).

curl --header "apikey: [[app:key]]" "https://api.triathlon.org/v1/statistics/results?group_by=position&filters=athlete.id,eq,7788|position,lte,3&timeframe=this_7_years&interval=yearly"

🚧

Intervals require a timeframe to be set

You must set a timeframe to use intervals. The API will return a 400 Bad Request response if it is not set.
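A client can guard against this 400 response by checking its parameters before building the request URL, as in the sketch below. The helper name build_results_url is hypothetical, and treating a start_date/end_date pair as satisfying the timeframe requirement is an assumption that this page does not confirm.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.triathlon.org/v1/statistics/results"


def build_results_url(params: dict) -> str:
    """Build an encoded results URL, rejecting an interval with no timeframe."""
    # Assumption: an absolute start_date/end_date pair counts as a timeframe.
    has_timeframe = "timeframe" in params or (
        "start_date" in params and "end_date" in params
    )
    if "interval" in params and not has_timeframe:
        raise ValueError("interval requires a timeframe to be set")
    return f"{BASE_URL}?{urlencode(params)}"


url = build_results_url({
    "group_by": "position",
    "filters": "athlete.id,eq,7788|position,lte,3",
    "timeframe": "this_7_years",
    "interval": "yearly",
})
print(url)
```

The resulting URL can be passed to curl together with the apikey header shown in the examples above.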