Exploratory data analysis (EDA) is a powerful way to study the available data comprehensively and to answer basic data analysis questions.
What distinguishes it from traditional analysis based on testing a priori hypotheses is that EDA lets you look, with a variety of methods, for any systematic relationships in the data rather than only the ones you expected. EDA is open-ended in both time and methods, which makes it possible to spot curious data fragments and correlations. As a result, you can examine the information more deeply and accurately and choose a suitable model for further work.
In the Python ecosystem, there is a wide range of libraries that simplify and streamline the process of exploring a dataset. We will use the Google Play Store Apps dataset and go through the main tasks of exploratory analysis to find out whether there are any trends that can help in formulating and solving a business problem.
Before we start exploring our data, we must import the dataset and the Python libraries needed for further work. We will use the pandas library, a very powerful tool for comprehensive data analysis.
import pandas as pd
googleplaystore = pd.read_csv("googleplaystore.csv")
Let's explore the structure of our dataframe by viewing the first and the last 10 rows.
googleplaystore.head(10)
| | App | Category | Rating | Reviews | Size | Installs | Type | Price | Content Rating | Genres | Last Updated | Current Ver | Android Ver |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Photo Editor & Candy Camera & Grid & ScrapBook | ART_AND_DESIGN | 4.1 | 159 | 19M | 10,000+ | Free | 0 | Everyone | Art & Design | January 7, 2018 | 1.0.0 | 4.0.3 and up |
1 | Coloring book moana | ART_AND_DESIGN | 3.9 | 967 | 14M | 500,000+ | Free | 0 | Everyone | Art & Design;Pretend Play | January 15, 2018 | 2.0.0 | 4.0.3 and up |
2 | U Launcher Lite – FREE Live Cool Themes, Hide ... | ART_AND_DESIGN | 4.7 | 87510 | 8.7M | 5,000,000+ | Free | 0 | Everyone | Art & Design | August 1, 2018 | 1.2.4 | 4.0.3 and up |
3 | Sketch - Draw & Paint | ART_AND_DESIGN | 4.5 | 215644 | 25M | 50,000,000+ | Free | 0 | Teen | Art & Design | June 8, 2018 | Varies with device | 4.2 and up |
4 | Pixel Draw - Number Art Coloring Book | ART_AND_DESIGN | 4.3 | 967 | 2.8M | 100,000+ | Free | 0 | Everyone | Art & Design;Creativity | June 20, 2018 | 1.1 | 4.4 and up |
5 | Paper flowers instructions | ART_AND_DESIGN | 4.4 | 167 | 5.6M | 50,000+ | Free | 0 | Everyone | Art & Design | March 26, 2017 | 1.0 | 2.3 and up |
6 | Smoke Effect Photo Maker - Smoke Editor | ART_AND_DESIGN | 3.8 | 178 | 19M | 50,000+ | Free | 0 | Everyone | Art & Design | April 26, 2018 | 1.1 | 4.0.3 and up |
7 | Infinite Painter | ART_AND_DESIGN | 4.1 | 36815 | 29M | 1,000,000+ | Free | 0 | Everyone | Art & Design | June 14, 2018 | 6.1.61.1 | 4.2 and up |
8 | Garden Coloring Book | ART_AND_DESIGN | 4.4 | 13791 | 33M | 1,000,000+ | Free | 0 | Everyone | Art & Design | September 20, 2017 | 2.9.2 | 3.0 and up |
9 | Kids Paint Free - Drawing Fun | ART_AND_DESIGN | 4.7 | 121 | 3.1M | 10,000+ | Free | 0 | Everyone | Art & Design;Creativity | July 3, 2018 | 2.8 | 4.0.3 and up |
googleplaystore.tail(10)
| | App | Category | Rating | Reviews | Size | Installs | Type | Price | Content Rating | Genres | Last Updated | Current Ver | Android Ver |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
10831 | payermonstationnement.fr | MAPS_AND_NAVIGATION | NaN | 38 | 9.8M | 5,000+ | Free | 0 | Everyone | Maps & Navigation | June 13, 2018 | 2.0.148.0 | 4.0 and up |
10832 | FR Tides | WEATHER | 3.8 | 1195 | 582k | 100,000+ | Free | 0 | Everyone | Weather | February 16, 2014 | 6.0 | 2.1 and up |
10833 | Chemin (fr) | BOOKS_AND_REFERENCE | 4.8 | 44 | 619k | 1,000+ | Free | 0 | Everyone | Books & Reference | March 23, 2014 | 0.8 | 2.2 and up |
10834 | FR Calculator | FAMILY | 4.0 | 7 | 2.6M | 500+ | Free | 0 | Everyone | Education | June 18, 2017 | 1.0.0 | 4.1 and up |
10835 | FR Forms | BUSINESS | NaN | 0 | 9.6M | 10+ | Free | 0 | Everyone | Business | September 29, 2016 | 1.1.5 | 4.0 and up |
10836 | Sya9a Maroc - FR | FAMILY | 4.5 | 38 | 53M | 5,000+ | Free | 0 | Everyone | Education | July 25, 2017 | 1.48 | 4.1 and up |
10837 | Fr. Mike Schmitz Audio Teachings | FAMILY | 5.0 | 4 | 3.6M | 100+ | Free | 0 | Everyone | Education | July 6, 2018 | 1.0 | 4.1 and up |
10838 | Parkinson Exercices FR | MEDICAL | NaN | 3 | 9.5M | 1,000+ | Free | 0 | Everyone | Medical | January 20, 2017 | 1.0 | 2.2 and up |
10839 | The SCP Foundation DB fr nn5n | BOOKS_AND_REFERENCE | 4.5 | 114 | Varies with device | 1,000+ | Free | 0 | Mature 17+ | Books & Reference | January 19, 2015 | Varies with device | Varies with device |
10840 | iHoroscope - 2018 Daily Horoscope & Astrology | LIFESTYLE | 4.5 | 398307 | 19M | 10,000,000+ | Free | 0 | Everyone | Lifestyle | July 25, 2018 | Varies with device | Varies with device |
We can see that the googleplaystore dataframe has missing values. To get a fuller picture of the data, let's do a few more things. First, we will use the pandas describe() method to get a statistical summary of the numerical columns in our dataset. We can also use the info() method to check the data type and the number of non-null values in each column, and the shape attribute (an attribute rather than a method, so it is written without parentheses) to retrieve the number of rows and columns in the dataframe.
googleplaystore.describe()
| | Rating |
---|---|
count | 9367.000000 |
mean | 4.193338 |
std | 0.537431 |
min | 1.000000 |
25% | 4.000000 |
50% | 4.300000 |
75% | 4.500000 |
max | 19.000000 |
googleplaystore.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10841 entries, 0 to 10840
Data columns (total 13 columns):
App               10841 non-null object
Category          10841 non-null object
Rating            9367 non-null float64
Reviews           10841 non-null object
Size              10841 non-null object
Installs          10841 non-null object
Type              10840 non-null object
Price             10841 non-null object
Content Rating    10840 non-null object
Genres            10841 non-null object
Last Updated      10841 non-null object
Current Ver       10833 non-null object
Android Ver       10838 non-null object
dtypes: float64(1), object(12)
memory usage: 1.1+ MB
googleplaystore.shape
(10841, 13)
googleplaystore.dtypes
App               object
Category          object
Rating            float64
Reviews           object
Size              object
Installs          object
Type              object
Price             object
Content Rating    object
Genres            object
Last Updated      object
Current Ver       object
Android Ver       object
dtype: object
So, what have we learned from these few steps? First, the dataset contains apps divided into various categories. Second, although columns such as "Reviews" contain numeric data, they are stored with a non-numeric (object) type, which can cause problems during further processing.
We are also interested in the total number of apps and the available categories in the dataset. To get the exact number of apps, we will count the unique values in the corresponding column.
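One way to check which text-typed columns are "really" numeric is to try parsing them and see how much survives. The snippet below is a minimal sketch on a hypothetical three-row frame (the `sample` data and the `numeric_share` helper are illustrative names, not part of the tutorial's code); it assumes that values which fail to parse should simply be counted as non-numeric.

```python
import pandas as pd

# Hypothetical toy frame standing in for a few columns of the real dataset.
sample = pd.DataFrame({
    "App": ["A", "B", "C"],
    "Reviews": ["159", "967", "3.0M"],  # stored as text, one entry is not a plain number
    "Rating": [4.1, 3.9, 4.7],
})


def numeric_share(column: pd.Series) -> float:
    """Return the fraction of non-null entries that parse as numbers."""
    parsed = pd.to_numeric(column, errors="coerce")  # unparseable values become NaN
    non_null = int(column.notna().sum())
    return float(parsed.notna().sum()) / non_null if non_null else 0.0


# Only inspect columns that pandas does not already treat as numeric.
shares = {
    name: numeric_share(sample[name])
    for name in sample.columns
    if not pd.api.types.is_numeric_dtype(sample[name])
}
print(shares)
```

A share close to 1 suggests a column that should be converted; a share below 1 points at a few entries that need cleaning first.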
len(googleplaystore["App"].unique())
9660
unique_categories = googleplaystore["Category"].unique()
unique_categories
array(['ART_AND_DESIGN', 'AUTO_AND_VEHICLES', 'BEAUTY', 'BOOKS_AND_REFERENCE', 'BUSINESS', 'COMICS', 'COMMUNICATION', 'DATING', 'EDUCATION', 'ENTERTAINMENT', 'EVENTS', 'FINANCE', 'FOOD_AND_DRINK', 'HEALTH_AND_FITNESS', 'HOUSE_AND_HOME', 'LIBRARIES_AND_DEMO', 'LIFESTYLE', 'GAME', 'FAMILY', 'MEDICAL', 'SOCIAL', 'SHOPPING', 'PHOTOGRAPHY', 'SPORTS', 'TRAVEL_AND_LOCAL', 'TOOLS', 'PERSONALIZATION', 'PRODUCTIVITY', 'PARENTING', 'WEATHER', 'VIDEO_PLAYERS', 'NEWS_AND_MAGAZINES', 'MAPS_AND_NAVIGATION', '1.9'], dtype=object)
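The category list ends with an odd value, '1.9', and the earlier describe() output reports a maximum rating of 19.0, which is outside the 1 to 5 scale. Both hint at a malformed row. A quick way to surface such rows might look like the sketch below; the `sample` frame is hypothetical, and the rule that valid categories consist only of upper-case letters and underscores is an assumption read off the list above.

```python
import pandas as pd

# Hypothetical rows; the middle one imitates a record with shifted columns.
sample = pd.DataFrame({
    "App": ["Sketch", "Broken row", "FR Tides"],
    "Category": ["ART_AND_DESIGN", "1.9", "WEATHER"],
    "Rating": [4.5, 19.0, 3.8],
})

# Assumption: valid category labels are upper-case words joined by underscores.
bad_category = ~sample["Category"].str.fullmatch(r"[A-Z_]+")
# Ratings are expected to lie between 1 and 5; missing ratings are not flagged here.
bad_rating = sample["Rating"].notna() & ~sample["Rating"].between(1, 5)

suspicious = sample[bad_category | bad_rating]
print(suspicious)
```

Inspecting the flagged rows before any cleaning makes it easier to decide whether to repair or drop them.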
Datasets often contain duplicate rows, which can degrade the quality and accuracy of the analysis. Such rows also clutter the dataset, so we need to get rid of them.
googleplaystore.drop_duplicates(keep='first', inplace = True)
googleplaystore.shape
(10358, 13)
To remove duplicate rows from a dataset, pandas provides the powerful and customizable drop_duplicates() method, which takes parameters that control how the dataset is cleaned. "keep='first'" means that the method keeps the first occurrence of each duplicated row and drops the rest (by contrast, "keep=False" would drop every copy, including the first). "inplace = True" means that the changes are applied directly to the dataframe we are currently using.
As we can see above, our initial googleplaystore dataset contained 10841 rows. After removing duplicates, the number of rows decreased to 10358.
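The effect of the `keep` parameter is easiest to see on a tiny example. The sketch below uses a hypothetical four-row frame (`toy` is an illustrative name) and also shows the `subset` parameter, which may explain why the deduplicated frame (10358 rows) still has more rows than there are unique app names (9660): rows that share an app name but differ in any other column are not full duplicates.

```python
import pandas as pd

# Hypothetical data: rows 0 and 1 are identical, row 3 shares only the app name.
toy = pd.DataFrame({
    "App": ["Maps", "Maps", "Notes", "Maps"],
    "Reviews": [10, 10, 5, 12],
})

# Keep the first copy of each fully identical row (what the tutorial does).
first_kept = toy.drop_duplicates(keep="first")

# Drop every row that has an identical twin, including the first copy.
none_kept = toy.drop_duplicates(keep=False)

# Treat rows as duplicates when only the "App" column matches.
one_per_app = toy.drop_duplicates(subset=["App"], keep="first")

print(len(first_kept), len(none_kept), len(one_per_app))
```

Whether to deduplicate on the full row or on the app name alone depends on the question being asked; the tutorial keeps the full-row definition.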
Another common problem in almost every dataset is columns with missing values. We will explore only the most common ways to clean a dataset of missing values.
First, let's look at the total number of missing values in every column. One of the great things about pandas is that it lets users chain several operations in a single expression, which makes the code more compact.
googleplaystore.isnull().sum().sort_values(ascending=False)
Rating            1465
Current Ver          8
Android Ver          3
Content Rating       1
Type                 1
Last Updated         0
Genres               0
Price                0
Installs             0
Size                 0
Reviews              0
Category             0
App                  0
dtype: int64
Now, let's get rid of all the rows with missing values. Although some statistical approaches allow us to impute missing data with some values (like the most common value or the mean value), today we will work only with cleaned data.
The pandas dropna() method also lets users set parameters that control how the data is processed. Here we specify that every row containing at least one NA value must be dropped and that the changes are applied directly to our dataframe.
googleplaystore.dropna(how ='any', inplace = True)
Let's now check the shape of the dataframe after all cleaning manipulations were performed.
googleplaystore.shape
(8886, 13)
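For comparison, the imputation route mentioned above could be sketched as follows. This is not what the tutorial does; it is a minimal illustration on a hypothetical frame (`toy` and `imputed` are illustrative names), filling a numeric column with its mean and a categorical column with its most common value.

```python
import pandas as pd

# Hypothetical data with one missing value per column.
toy = pd.DataFrame({
    "Rating": [4.0, None, 5.0, 3.0],
    "Type": ["Free", "Paid", None, "Free"],
})

imputed = toy.copy()
# Numeric column: replace missing values with the column mean (4.0 here).
imputed["Rating"] = imputed["Rating"].fillna(imputed["Rating"].mean())
# Categorical column: replace missing values with the most frequent label.
imputed["Type"] = imputed["Type"].fillna(imputed["Type"].mode().iloc[0])

print(imputed)
```

Imputation keeps more rows but can flatten the distribution of the filled column, so it is worth comparing results with and without it.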
If we look closer at our dataset and the output of the dtypes attribute, we can see that columns such as "Reviews", "Size", "Price" and "Installs" should clearly hold numeric values. So, let's see what values these columns contain in order to plan the conversions.
googleplaystore.Price.unique()
array(['0', '$4.99', '$3.99', '$6.99', '$7.99', '$5.99', '$2.99', '$3.49', '$1.99', '$9.99', '$7.49', '$0.99', '$9.00', '$5.49', '$10.00', '$24.99', '$11.99', '$79.99', '$16.99', '$14.99', '$29.99', '$12.99', '$2.49', '$10.99', '$1.50', '$19.99', '$15.99', '$33.99', '$39.99', '$3.95', '$4.49', '$1.70', '$8.99', '$1.49', '$3.88', '$399.99', '$17.99', '$400.00', '$3.02', '$1.76', '$4.84', '$4.77', '$1.61', '$2.50', '$1.59', '$6.49', '$1.29', '$299.99', '$379.99', '$37.99', '$18.99', '$389.99', '$8.49', '$1.75', '$14.00', '$2.00', '$3.08', '$2.59', '$19.40', '$3.90', '$4.59', '$15.46', '$3.04', '$13.99', '$4.29', '$3.28', '$4.60', '$1.00', '$2.95', '$2.90', '$1.97', '$2.56', '$1.20'], dtype=object)
googleplaystore.Installs.unique()
array(['10,000+', '500,000+', '5,000,000+', '50,000,000+', '100,000+', '50,000+', '1,000,000+', '10,000,000+', '5,000+', '100,000,000+', '1,000,000,000+', '1,000+', '500,000,000+', '100+', '500+', '10+', '5+', '50+', '1+'], dtype=object)
googleplaystore.Size.unique()
array(['19M', '14M', '8.7M', '25M', '2.8M', '5.6M', '29M', '33M', '3.1M', '28M', '12M', '20M', '21M', '37M', '5.5M', '17M', '39M', '31M', '4.2M', '23M', '6.0M', '6.1M', '4.6M', '9.2M', '5.2M', '11M', '24M', 'Varies with device', '9.4M', '15M', '10M', '1.2M', '26M', '8.0M', '7.9M', '56M', '57M', '35M', '54M', '201k', '3.6M', '5.7M', '8.6M', '2.4M', '27M', '2.7M', '2.5M', '7.0M', '16M', '3.4M', '8.9M', '3.9M', '2.9M', '38M', '32M', '5.4M', '18M', '1.1M', '2.2M', '4.5M', '9.8M', '52M', '9.0M', '6.7M', '30M', '2.6M', '7.1M', '22M', '6.4M', '3.2M', '8.2M', '4.9M', '9.5M', '5.0M', '5.9M', '13M', '73M', '6.8M', '3.5M', '4.0M', '2.3M', '2.1M', '42M', '9.1M', '55M', '23k', '7.3M', '6.5M', '1.5M', '7.5M', '51M', '41M', '48M', '8.5M', '46M', '8.3M', '4.3M', '4.7M', '3.3M', '40M', '7.8M', '8.8M', '6.6M', '5.1M', '61M', '66M', '79k', '8.4M', '3.7M', '118k', '44M', '695k', '1.6M', '6.2M', '53M', '1.4M', '3.0M', '7.2M', '5.8M', '3.8M', '9.6M', '45M', '63M', '49M', '77M', '4.4M', '70M', '9.3M', '8.1M', '36M', '6.9M', '7.4M', '84M', '97M', '2.0M', '1.9M', '1.8M', '5.3M', '47M', '556k', '526k', '76M', '7.6M', '59M', '9.7M', '78M', '72M', '43M', '7.7M', '6.3M', '334k', '93M', '65M', '79M', '100M', '58M', '50M', '68M', '64M', '34M', '67M', '60M', '94M', '9.9M', '232k', '99M', '624k', '95M', '8.5k', '41k', '292k', '80M', '1.7M', '10.0M', '74M', '62M', '69M', '75M', '98M', '85M', '82M', '96M', '87M', '71M', '86M', '91M', '81M', '92M', '83M', '88M', '704k', '862k', '899k', '378k', '4.8M', '266k', '375k', '1.3M', '975k', '980k', '4.1M', '89M', '696k', '544k', '525k', '920k', '779k', '853k', '720k', '713k', '772k', '318k', '58k', '241k', '196k', '857k', '51k', '953k', '865k', '251k', '930k', '540k', '313k', '746k', '203k', '26k', '314k', '239k', '371k', '220k', '730k', '756k', '91k', '293k', '17k', '74k', '14k', '317k', '78k', '924k', '818k', '81k', '939k', '169k', '45k', '965k', '90M', '545k', '61k', '283k', '655k', '714k', '93k', '872k', '121k', '322k', '976k', '206k', '954k', '444k', 
'717k', '210k', '609k', '308k', '306k', '175k', '350k', '383k', '454k', '1.0M', '70k', '812k', '442k', '842k', '417k', '412k', '459k', '478k', '335k', '782k', '721k', '430k', '429k', '192k', '460k', '728k', '496k', '816k', '414k', '506k', '887k', '613k', '778k', '683k', '592k', '186k', '840k', '647k', '373k', '437k', '598k', '716k', '585k', '982k', '219k', '55k', '323k', '691k', '511k', '951k', '963k', '25k', '554k', '351k', '27k', '82k', '208k', '551k', '29k', '103k', '116k', '153k', '209k', '499k', '173k', '597k', '809k', '122k', '411k', '400k', '801k', '787k', '50k', '643k', '986k', '516k', '837k', '780k', '20k', '498k', '600k', '656k', '221k', '228k', '176k', '34k', '259k', '164k', '458k', '629k', '28k', '288k', '775k', '785k', '636k', '916k', '994k', '309k', '485k', '914k', '903k', '608k', '500k', '54k', '562k', '847k', '948k', '811k', '270k', '48k', '523k', '784k', '280k', '24k', '892k', '154k', '18k', '33k', '860k', '364k', '387k', '626k', '161k', '879k', '39k', '170k', '141k', '160k', '144k', '143k', '190k', '376k', '193k', '473k', '246k', '73k', '253k', '957k', '420k', '72k', '404k', '470k', '226k', '240k', '89k', '234k', '257k', '861k', '467k', '676k', '552k', '582k', '619k'], dtype=object)
First of all, let's get rid of the dollar sign in the "Price" column and convert the values to a numeric type.
googleplaystore['Price'] = googleplaystore['Price'].apply(lambda x: x.replace('$', '') if '$' in str(x) else x)
googleplaystore['Price'] = googleplaystore['Price'].apply(lambda x: float(x))
Now, we will work with the "Installs" column. We must get rid of the plus sign and the thousands separators (commas) and then convert the values to a numeric type.
googleplaystore['Installs'] = googleplaystore['Installs'].apply(lambda x: x.replace('+', '') if '+' in str(x) else x)
googleplaystore['Installs'] = googleplaystore['Installs'].apply(lambda x: x.replace(',', '') if ',' in str(x) else x)
googleplaystore['Installs'] = googleplaystore['Installs'].apply(lambda x: int(x))
Also, convert "Reviews" column to numeric type.
googleplaystore['Reviews'] = googleplaystore['Reviews'].apply(lambda x: int(x))
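The same three conversions can also be written with vectorized string methods instead of apply() with a lambda. The snippet below is a sketch on a hypothetical three-row frame (`toy` is an illustrative name) that only mirrors the value formats shown above; it is an alternative style, not the tutorial's own code.

```python
import pandas as pd

# Hypothetical rows that mirror the raw formats of the three columns.
toy = pd.DataFrame({
    "Price": ["0", "$4.99", "$399.99"],
    "Installs": ["10,000+", "1,000,000,000+", "5+"],
    "Reviews": ["159", "967", "87510"],
})

# Strip the literal dollar sign, then let pandas infer a float type.
toy["Price"] = pd.to_numeric(toy["Price"].str.replace("$", "", regex=False))
# Remove plus signs and commas in one pass with a character class.
toy["Installs"] = pd.to_numeric(toy["Installs"].str.replace(r"[+,]", "", regex=True))
# Plain digit strings need no cleaning before conversion.
toy["Reviews"] = pd.to_numeric(toy["Reviews"])

print(toy.dtypes)
```

Passing `regex=False` for the dollar sign avoids it being interpreted as a regular-expression anchor.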
Finally, let's work with the "Size" column, which needs a more involved approach. Besides numeric values, given in either megabytes ("M") or kilobytes ("k"), it contains the string "Varies with device", which we will treat as a missing value. We also need to bring the megabyte and kilobyte values to a common unit (megabytes).
# "Varies with device" carries no size information, so mark it as missing
googleplaystore['Size'] = googleplaystore['Size'].apply(lambda x: str(x).replace('Varies with device', 'NaN') if 'Varies with device' in str(x) else x)
# Values in megabytes: drop the unit suffix
googleplaystore['Size'] = googleplaystore['Size'].apply(lambda x: str(x).replace('M', '') if 'M' in str(x) else x)
# Remove thousands separators, if any
googleplaystore['Size'] = googleplaystore['Size'].apply(lambda x: str(x).replace(',', '') if ',' in str(x) else x)
# Values in kilobytes: drop the suffix and convert to megabytes
googleplaystore['Size'] = googleplaystore['Size'].apply(lambda x: float(str(x).replace('k', '')) / 1000 if 'k' in str(x) else x)
googleplaystore['Size'] = googleplaystore['Size'].apply(lambda x: float(x))
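The chain of replacements for "Size" can also be collected into one small function, which may be easier to read and test. The sketch below uses a hypothetical helper name, `size_to_megabytes`, and keeps the same convention as the code above (kilobytes divided by 1000, anything without a recognized suffix treated as missing).

```python
import math

import pandas as pd


def size_to_megabytes(raw) -> float:
    """Convert strings such as '19M' or '8.5k' to a size in megabytes."""
    text = str(raw).strip().replace(",", "")
    if text.endswith("M"):
        return float(text[:-1])
    if text.endswith("k"):
        return float(text[:-1]) / 1000  # same divisor as the tutorial code
    return float("nan")  # e.g. 'Varies with device'


# Hypothetical values covering each format found in the column.
toy = pd.Series(["19M", "8.5k", "Varies with device", "1.0M"])
sizes = toy.map(size_to_megabytes)
print(sizes)
```

On the real dataframe this would be applied as `googleplaystore['Size'].map(size_to_megabytes)` in place of the five statements above.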
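Let's call the describe() method one more time. As we can see, we now have a statistical summary for all the columns that contain numeric values. Note that the count for "Size" is lower (7418) because the "Varies with device" entries became missing values.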
googleplaystore.describe()
| | Rating | Reviews | Size | Installs | Price |
---|---|---|---|---|---|
count | 8886.000000 | 8.886000e+03 | 7418.000000 | 8.886000e+03 | 8886.000000 |
mean | 4.187959 | 4.730928e+05 | 22.760829 | 1.650061e+07 | 0.963526 |
std | 0.522428 | 2.906007e+06 | 23.439210 | 8.640413e+07 | 16.194792 |
min | 1.000000 | 1.000000e+00 | 0.008500 | 1.000000e+00 | 0.000000 |
25% | 4.000000 | 1.640000e+02 | 5.100000 | 1.000000e+04 | 0.000000 |
50% | 4.300000 | 4.723000e+03 | 14.000000 | 5.000000e+05 | 0.000000 |
75% | 4.500000 | 7.131325e+04 | 33.000000 | 5.000000e+06 | 0.000000 |
max | 5.000000 | 7.815831e+07 | 100.000000 | 1.000000e+09 | 400.000000 |
Visualization is probably one of the most useful approaches in data analysis. Sometimes not all the correlations and dependencies can be seen from the tabular data, and therefore various plots and diagrams can help to clearly depict them.
Let's go through the different ways we can explore categories.
One of the fanciest ways to visualize such data is a word cloud. With a few lines of code, we can create an illustration that shows which categories have the largest number of apps.
import matplotlib.pyplot as plt
import wordcloud
from wordcloud import WordCloud
import seaborn as sns

color = sns.color_palette()
# Jupyter/IPython magic: render matplotlib figures inside the notebook
%matplotlib inline
from plotly import tools
from plotly.offline import iplot, init_notebook_mode
from IPython.display import Image
import plotly.offline as py
import plotly.graph_objs as go
import plotly.io as pio
import numpy as np

py.init_notebook_mode()
# Build the word cloud from the category labels; more frequent labels are drawn larger
wc = WordCloud(max_font_size=250, collocations=False, max_words=33, width=1600, height=800, background_color="white").generate(' '.join(googleplaystore['Category']))
plt.figure(figsize=(20, 10))
plt.imshow(wc, interpolation="bilinear")
plt.axis("off")
plt.tight_layout(pad=0)
plt.show()
# Keep only categories with more than 286 apps and plot one rating histogram per category
groups = googleplaystore.groupby('Category').filter(lambda x: len(x) > 286).reset_index()
array = groups['Rating'].hist(by=groups['Category'], sharex=True, figsize=(20,20))
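A word cloud gives a quick impression, but the exact counts behind it are one method call away. The sketch below uses a hypothetical list of category labels (`toy` is an illustrative name) to show how `value_counts()` returns the same ranking in numeric form.

```python
import pandas as pd

# Hypothetical category labels standing in for the "Category" column.
toy = pd.Series(
    ["FAMILY", "GAME", "FAMILY", "TOOLS", "FAMILY", "GAME"],
    name="Category",
)

# value_counts() sorts labels from most to least frequent.
counts = toy.value_counts()
top_two = counts.head(2)
print(top_two)
```

On the real dataframe the equivalent call would be `googleplaystore['Category'].value_counts()`.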
As the histograms show, the distribution of app ratings differs noticeably across the categories.
And what do we learn if we look at the distribution of ratings across all of the apps?
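To put numbers on the differences between categories, the ratings can be summarized per group. This is a minimal sketch on a hypothetical five-row frame (`toy` and `summary` are illustrative names), not output from the real dataset.

```python
import pandas as pd

# Hypothetical ratings for two categories.
toy = pd.DataFrame({
    "Category": ["GAME", "GAME", "TOOLS", "TOOLS", "TOOLS"],
    "Rating": [4.5, 4.1, 3.9, 4.0, 4.1],
})

# Mean, median and number of apps per category, best-rated category first.
summary = (
    toy.groupby("Category")["Rating"]
    .agg(["mean", "median", "count"])
    .sort_values("mean", ascending=False)
)
print(summary)
```

Reporting the count next to the mean helps avoid over-reading averages that rest on only a handful of apps.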
avg_rate_data = go.Figure()
avg_rate_data.add_histogram(
    x = googleplaystore.Rating,
    xbins = {'start': 1, 'size': 0.1, 'end' :6}
)
iplot(avg_rate_data)
img_bytes = pio.to_image(avg_rate_data, format='png', width=1600, height=800, scale=2)
Image(img_bytes)
As we can see, most of the apps clearly hold a rating above 4.0! Actually, quite a lot of apps seem to have 5.0 rating. Let's check how many apps do have the highest possible rating.
googleplaystore.Rating[googleplaystore['Rating'] == 5 ].count()
271
But does any feature from the dataset really affect an app's rating? Let's try to figure out how size, number of installs, reviews, and price correlate with each other and then explore the impact of each feature on the rating.
First of all, let's build a heatmap. For exploring correlations between features, a heatmap is among the best visual tools. The individual values in the correlation matrix are represented by different colors, which helps to quickly see which features are most and least strongly related.
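# Restrict to numeric columns: pandas 2.0 and later raise an error if corr() meets text columns
sns.heatmap(googleplaystore.select_dtypes(include='number').corr(), annot=True, linewidths=0.5)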
A positive correlation of 0.62 exists between the number of reviews and the number of installs. This is consistent with the idea that apps with more installs collect more reviews, and possibly that heavily reviewed apps attract more downloads, although correlation alone does not tell us the direction of the effect.
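The coefficient shown in the heatmap is the Pearson correlation, which measures linear association and is sensitive to a few very large values. Because installs and reviews span many orders of magnitude, a rank-based (Spearman) coefficient can be a useful cross-check. The sketch below computes both on hypothetical numbers (`toy` is an illustrative name); the Spearman value is obtained as the Pearson correlation of the ranks.

```python
import pandas as pd

# Hypothetical counts that grow together but not linearly.
toy = pd.DataFrame({
    "Reviews": [10, 200, 3000, 40000, 500000],
    "Installs": [1000, 10000, 100000, 1000000, 10000000],
})

# Pearson: linear association on the raw values.
pearson = toy["Reviews"].corr(toy["Installs"])

# Spearman: Pearson correlation computed on the ranks of the values.
spearman = toy["Reviews"].rank().corr(toy["Installs"].rank())

print(round(pearson, 4), round(spearman, 4))
```

When the two coefficients differ a lot, the relationship is likely monotonic but non-linear, or driven by outliers.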
Despite the fact that modern phones and tablets have enough memory to deal with various kinds of tasks and store gigabytes of data, the size of the apps still matters. Let's explore whether this value really affects app rating or not.
To find an answer to this question, we will use a scatter plot (here a seaborn joint plot, which also shows the marginal distributions), one of the most common and informative ways to see how two variables relate.
groups = googleplaystore.groupby('Category').filter(lambda x: len(x) >= 50).reset_index()
sns.set_style("whitegrid")
# Recent seaborn versions require x and y to be passed as keyword arguments
ax = sns.jointplot(x=googleplaystore['Size'], y=googleplaystore['Rating'])
As we can see, most of the apps with the highest rating have a size between approximately 20 MB and 40 MB.
paid_apps = googleplaystore[googleplaystore.Price > 0]
# Recent seaborn versions require x, y and data to be passed as keyword arguments
p = sns.jointplot(x="Price", y="Rating", data=paid_apps)
So, the top-rated apps do not have high prices: only a few apps have a price higher than $20.
sns.set_style('whitegrid')
fig, ax = plt.subplots()
fig.set_size_inches(15, 8)
p = sns.stripplot(x="Price", y="Category", data=googleplaystore, jitter=True, linewidth=1)
title = ax.set_title('App pricing trends across categories')
googleplaystore[['Category', 'App']][googleplaystore.Price > 200].groupby([ "Category"], as_index=False).count()
| | Category | App |
---|---|---|
0 | FAMILY | 4 |
1 | FINANCE | 6 |
2 | LIFESTYLE | 5 |
To visualize the answer, we will use a boxplot, which lets us compare the range and distribution of the number of downloads for paid and free apps. Boxplots also help to answer questions like:
# Installs are plotted on a log10 scale because they span many orders of magnitude
trace0 = go.Box(
    y=np.log10(googleplaystore['Installs'][googleplaystore.Type=='Paid']),
    name = 'Paid',
    marker = dict(
        color = 'rgb(214, 12, 140)',
    )
)
trace1 = go.Box(
    y=np.log10(googleplaystore['Installs'][googleplaystore.Type=='Free']),
    name = 'Free',
    marker = dict(
        color = 'rgb(0, 128, 128)',
    )
)
layout = go.Layout(
    title = "Paid apps Vs free apps",
    yaxis= {'title': 'Downloads (log-scaled)'}
)
data = [trace0, trace1]
iplot({'data': data, 'layout': layout})
As we can see, paid apps are downloaded less frequently than free ones.
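The visual impression from the boxplot can be backed by a simple per-group statistic. The sketch below compares median installs for the two app types on hypothetical numbers (`toy` and `medians` are illustrative names); the median is used rather than the mean because install counts are heavily skewed.

```python
import pandas as pd

# Hypothetical install counts for free and paid apps.
toy = pd.DataFrame({
    "Type": ["Free", "Free", "Free", "Paid", "Paid", "Paid"],
    "Installs": [1_000, 100_000, 10_000_000, 100, 1_000, 10_000],
})

# Median installs per app type; robust to a few extremely popular apps.
medians = toy.groupby("Type")["Installs"].median()
print(medians)
```

On the real dataframe the same call would be `googleplaystore.groupby('Type')['Installs'].median()`.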
Exploratory data analysis is an integral part of working with data: it helps you get a general understanding of the dataset and find basic patterns that lead to first insights.
In this tutorial we walked through the general approaches to initial data exploration, using the app category and rating columns as examples. However, there are many other interesting dependencies and correlations left to explore in the other columns.
The dataset we used is available via the following link: https://www.kaggle.com/lava18/google-play-store-apps/activity