Blog

Improve your Psycopg2 executions for PostgreSQL in Python

01/07/2022

I’m sure everybody who has worked with Python and a PostgreSQL database is familiar with, or has at least heard of, the psycopg2 library. It is the most popular PostgreSQL database adapter for the Python programming language. In my work, I use this library every day and execute hundreds of automated statements. As always, I try to improve my code and its execution speed because, in the cloud, time is money. The longer your code runs the...

Accessing the News API in Python

12/05/2022

Accessing and analyzing media content is a fascinating part of data analytics. It allows you to follow trends of public interest over time or to see how stories evolve (e.g. newslens). While many media outlets offer APIs, it is cumbersome to query each of them individually. News API closes that gap and allows you to search and retrieve live articles from all over the web. In this tutorial we will retrieve the...

Using Azure Functions with Python

08/05/2020

Overview: Lately, I have worked a lot with the Azure Cloud. Overall, I have to say Azure offers a lot but is still not on the same level as its strongest competitors (AWS, Google). One thing that caught my eye is its support for different programming languages. Azure supports several languages (C#, JavaScript, Java, Python, etc.), but the supported features for these languages differ a lot. I think Azure Cloud is really great for...

Support Vector Machines (SVM) in Python

22/04/2020

Support Vector Machine (SVM) is a widely used supervised learning algorithm for classification and regression tasks. It is mostly used for classification problems. The points of different classes are separated by a hyperplane, and this hyperplane is chosen so that its distance to the nearest data points on each side is maximal. SVM has several advantages. The first one is that SVM works...

Nova-DB: Swiss Local Data API

10/10/2019

Accessing datasets in a structured form through an API can often simplify the life of a data analyst, especially if the same data series are used repeatedly. Unfortunately, many public data sources such as the Federal Statistical Office (BFS) do not provide data access through an API (STAT-TAB makes life a bit easier, but is not fully automated). While opendata.swiss offers a great way to explore available public datasets...

Connecting PostgreSQL to your script (Python)

29/07/2019

Introduction: PostgreSQL is probably one of the most powerful open-source relational databases available today. Its feature set is no worse than Oracle’s and definitely way ahead of MySQL’s. So if you are working on apps using Python, someday you will need to work with databases. Luckily, Python has a wide range of packages that provide an easy way of connecting to and using databases. In this...

Sentiment Analysis of Trump Tweets in Python

09/07/2019

What other people think about a product or service has a big impact on our everyday decision-making. In the past, people relied on the opinions of their friends and relatives, or on product and service reports, but the era of the Internet has brought significant changes. Today opinions are collected from different people around the world via e-commerce review sites as well as blogs and social networks. To transform gathered...

A guide to Exploratory Data Analysis in Python

20/05/2019

What is Exploratory Data Analysis? Exploratory data analysis (EDA) is a powerful tool for a comprehensive study of the available information, providing answers to basic data analysis questions. What distinguishes it from traditional analysis based on testing a priori hypotheses is that EDA makes it possible to detect, by using various methods, all potential systematic correlations in the data. Exploratory data...

Top Python Data Visualization Libraries for Data Science

09/04/2019

In the modern world, the flow of information that a person faces is daunting. This has led to a rather abrupt change in the basic principles of data perception. Therefore, visualization is becoming the main tool for presenting information. With the help of visualization, information is presented to the audience in a more accessible, clear, visual form. A properly chosen method of visualization can make it possible to structure large data arrays,...

Accessing ECB Exchange Rate Data in Python

10/01/2019

In this Jupyter Notebook we will retrieve data from the European Central Bank (ECB). The ECB publishes through the European Open Data Portal, which we discussed in the previous tutorial. Before diving into the code, please take a quick look at the following websites to get a feel for what we will be dealing with. EU portal: https://data.europa.eu/euodp/en/data/publisher/ecb ECB SDMX 2.1 RESTful web service:...

Random Forest in Python with scikit-learn

01/12/2018

The random forest algorithm is a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. It can be applied to different machine learning tasks, in particular classification and regression. Random Forest uses an ensemble of decision trees as a basis and therefore has all the advantages of decision trees, such as high accuracy,...

Ridge and Lasso in Python

12/11/2018

For many machine learning problems with a large number of features or a low number of observations, a linear model tends to overfit and variable selection is tricky. Models that use shrinkage, such as Lasso and Ridge, can improve the prediction accuracy because they reduce the estimation variance while providing an interpretable final model. In this tutorial, we will examine Ridge and Lasso regressions, compare them to...
