Blog

Accessing the News API in Python

12/05/2022

Accessing and analyzing media content is a fascinating part of data analytics. It lets you follow trends of public interest over time or see how stories evolve (e.g. newslens). While many media outlets offer APIs, it is cumbersome to query each one individually. News API closes that gap and lets you search and retrieve live articles from all over the web. In this tutorial we will retrieve the...

Making binary annotations less boring

31/07/2020

For a university project, I’m developing a music recommendation classifier based on the Spotify API. The idea is to recommend new music to users based on songs they personally like or dislike and on the musical components of each song (speed, tonality, instrumentality and many more). Preparing the dataset is usually the most time-consuming part of any machine learning project. This usually consists of gathering...

Nova-DB: Swiss Local Data API

10/10/2019

Accessing datasets in a structured form through an API can often simplify the life of a data analyst, especially if the same data series are used repeatedly. Unfortunately, many public data sources such as the Federal Statistical Office (BFS) do not provide data access through an API (STAT-TAB makes life a bit easier, but is not fully automated). While opendata.swiss offers a great way to explore available public datasets....

Building a simple sentiment classifier in R using Trump's tweets

12/08/2019

Over the past few years, tasks involving text and speech processing have become very popular. Among the many research areas in Natural Language Processing and Machine Learning, sentiment analysis ranks high. Sentiment analysis identifies and extracts subjective information from source data using data analysis and visualization, ML classification models, and text mining. This helps to...

Accessing ECB Exchange Rate Data in Python

10/01/2019

In this Jupyter Notebook we will retrieve data from the European Central Bank (ECB). The ECB publishes its data through the European Open Data Portal, which we discussed in the previous tutorial. Before diving into the code, please take a quick look at the following websites to get a feel for what we will be dealing with. EU portal: https://data.europa.eu/euodp/en/data/publisher/ecb ECB SDMX 2.1 RESTful web service:...

EU Open Data Portal API: A short guide

08/10/2018

The EU Open Data Portal gives access to open data published by EU institutions, agencies and other bodies. Around 70 EU institutions, bodies or departments use the platform to make over 12,500 datasets available. In this Jupyter Notebook we will retrieve data from the open data portal "http://data.europa.eu/euodp/en/home". The portal is based on the open source project CKAN. CKAN stands for Comprehensive Knowledge Archive...

Fetching data from the opendata.swiss API: A short tutorial in Python

24/08/2018

opendata.swiss is the Swiss authorities’ portal for open data. Currently, 65 governmental organizations (many federal agencies, but also cantonal agencies, SBB and Post) provide access to 7,057 datasets. The portal offers an easily searchable catalogue of available datasets. Manually downloading the datasets can be cumbersome, and retrieving data through the API can save time. In this Jupyter Notebook we will retrieve...

How to measure employer demand of data science software

01/06/2017

One approach to estimating and tracking employer demand for data science software is to analyze which skills are asked for in job ads. We did this using job ads on Indeed and showed which data science software skills are most in demand in Switzerland and worldwide. In this post, we describe the methods these analyses are based on. We worked with R, as it offers convenient packages that facilitate the task. Searching for jobs on Indeed...
