UCLappA WebSite

Customer Rating Reactions Can Be Predicted Purely Using App Features

by F. Sarro, M. Harman, Y. Jia, Y. Zhang

Abstract. In this paper we provide empirical evidence that the rating that an app attracts can be accurately predicted from the features it offers. Our results, based on an analysis of 11,537 apps from the Samsung Android and BlackBerry World app stores, indicate that the rating of 89% of these apps can be predicted with 100% accuracy. Our prediction model is built by using feature and rating information from the existing apps offered in the App Store and it yields highly accurate rating predictions, using only a few (11-12) existing apps for case-based prediction. These findings may have important implications for requirements engineering in app stores: They indicate that app developers may be able to obtain (very accurate) assessments of the customer reaction to their proposed feature sets (requirements), thereby providing new opportunities to support the requirements elicitation process for app developers.
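As a rough illustration of what case-based rating prediction can look like, the sketch below predicts a rating for a proposed feature set by averaging the ratings of the k most similar existing apps. It is a minimal sketch under stated assumptions, not the authors' model: the Jaccard similarity measure, the function names, the value of k, and the toy data are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two feature sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def predict_rating(new_features, case_base, k=3):
    """Average the ratings of the k existing apps most similar to new_features.

    case_base is a list of (feature_set, rating) pairs. The similarity
    measure and the plain average are illustrative assumptions only.
    """
    if not case_base:
        raise ValueError("case base must not be empty")
    ranked = sorted(
        case_base,
        key=lambda case: jaccard(new_features, case[0]),
        reverse=True,
    )
    nearest = ranked[:k]
    return sum(rating for _, rating in nearest) / len(nearest)


# Hypothetical toy data: feature sets and ratings are invented for illustration.
case_base = [
    ({"chat", "photo sharing"}, 4.5),
    ({"chat", "video call"}, 4.0),
    ({"flashlight"}, 2.5),
    ({"photo sharing", "filters"}, 3.5),
]
proposed = {"chat", "photo sharing", "filters"}
print(predict_rating(proposed, case_base, k=2))
```

With a larger case base, k would play the role of the small number of existing apps (11-12 in the abstract) consulted for each prediction.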

Details

This work has been accepted at RE 2018.
Download: full paper.
Data: To request the data used for this study contact Federica Sarro.

Investigating the Relationship between Price, Rating, and Popularity in the BlackBerry World App Store

by M. Harman, Y. Jia, W. Martin, F. Sarro, Y. Zhang

Abstract. Context: App stores provide a software development space and a market place that are both different from those to which we have become accustomed for traditional software development: The granularity is finer and there is a far greater source of information available for research and analysis. Information is available on price, customer rating and, through the data mining approach presented in this paper, the features claimed by app developers. These attributes make app stores ideal for empirical software engineering analysis. Objective: This paper exploits App Store Analysis to understand the rich interplay between app customers and their developers. Method: We use data mining to extract app descriptions, price, rating, and popularity information from the BlackBerry World App Store, and natural language processing to elicit each app's claimed features from its description. Results: The findings reveal that there are strong correlations between customer rating and popularity (rank of app downloads). We found evidence for a mild correlation between app price and the number of features claimed for the app and also found that higher priced features tended to be lower rated by their users. We also found that free apps have significantly (p-value < 0.001) higher ratings than non-free apps, with a moderately high effect size (Â12 = 0.68). All data from our experiments and analysis are made available on-line to support further investigations.
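The effect size Â12 quoted in the abstract is presumably the Vargha-Delaney statistic: the probability that a value drawn from the first sample exceeds a value drawn from the second, with ties counted as half. The sketch below computes it by direct pairwise comparison. The function name and the sample ratings are hypothetical and are not taken from the study's data.

```python
from itertools import product


def vargha_delaney_a12(xs, ys):
    """Estimate P(x > y) + 0.5 * P(x == y) over all pairs from xs and ys.

    A value of 0.5 means no difference; values above 0.5 mean that xs
    tends to be larger than ys.
    """
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    score = sum(
        1.0 if x > y else 0.5 if x == y else 0.0
        for x, y in product(xs, ys)
    )
    return score / (len(xs) * len(ys))


# Hypothetical ratings, invented for illustration only.
free_ratings = [4.5, 4.0, 5.0, 3.5]
paid_ratings = [3.0, 4.0, 2.5, 3.5]
print(vargha_delaney_a12(free_ratings, paid_ratings))
```

Read this way, the reported Â12 = 0.68 would mean that a randomly chosen free app is rated higher than a randomly chosen non-free app about 68% of the time.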

Details

This work has been published in the Journal of Information and Software Technology, IST 2017.
Download: full paper.
Data: To request the data used for this study contact Federica Sarro.

Exploring the Effects of Ad Schemes on the Performance Cost of Mobile Phones

by C. Gao, J. Zeng, F. Sarro, M. R. Lyu, I. King

Abstract. Advertising is an important revenue source for mobile app development, especially for free apps. However, ads also carry costs to users. Displaying ads can interfere with user experience, and ultimately lead to lower user retention and reduced earnings. Although there are recent studies devoted to directly mitigating ad costs, for example, by reducing the battery or memory consumed, comprehensive analysis of ad embedding schemes (e.g., ad sizes and ad providers) has rarely been conducted. In this paper, we focus on analyzing three types of performance cost, i.e., cost of memory/CPU, traffic, and battery. We explore 12 ad schemes used in 104 popular Android apps and compare their performance consumption. We show that the performance costs of the ad schemes we analyzed are significantly different. We also summarize the ad schemes that would generate low resource cost to users. Our summary is endorsed by 37 experienced app developers we surveyed.

Details

This work has been accepted at A-Mobile 2018, co-located with ASE 2018.
Download: full paper.

Causal Impact Analysis for App Releases in Google Play

by W. Martin, F. Sarro, M. Harman

Abstract. App developers would like to understand the impact of their own and their competitors’ software releases. To address this we introduce Causal Impact Release Analysis for app stores, and our tool, CIRA, that implements this analysis. We mined 38,858 popular Google Play apps, over a period of 12 months. For these apps, we identified 26,339 releases for which there was adequate prior and posterior time series data to facilitate causal impact analysis. We found that 33% of these releases caused a statistically significant change in user ratings. We use our approach to reveal important characteristics that distinguish causal significance in the Google Play store. To explore the actionability of causal impact analysis, we elicited the opinions of 52 developers: 75% concurred with the causal assessment, of which x% claimed that their company would consider changing their app release str

Details

This work has been accepted at FSE 2016.
Download: full paper.
Data: To request the data used for this study contact William Martin.

Clustering Mobile Apps Based on Mined Textual Descriptions

by A. A. Al-Subaihin, F. Sarro, S. Black, L. Capra, M. Harman, Y. Jia, Y. Zhang

Abstract. Context: Categorising software systems according to their functionality yields many benefits to both users and developers. Objective: In order to uncover the latent clustering of mobile apps in app stores, we propose a novel technique that measures app similarity based on claimed behaviour. Method: Features are extracted using information retrieval augmented with ontological analysis and used as attributes to characterise apps. These attributes are then used to cluster the apps using agglomerative hierarchical clustering. We empirically evaluate our approach on 17,877 apps mined from the BlackBerry and Google app stores in 2014. Results: The results show that our approach dramatically improves the existing categorisation quality for both BlackBerry (from 0.02 to 0.41 on average) and Google (from 0.03 to 0.21 on average) stores. We also find a strong Spearman rank correlation (ρ = 0.96 for Google and ρ = 0.99 for BlackBerry) between the number of apps and the ideal granularity within each category, indicating that ideal granularity increases with category size, as expected. Conclusions: Current categorisation in the app stores studied does not exhibit a good classification quality in terms of the claimed feature space. However, a better quality can be achieved using a good feature extraction technique and a traditional clustering method.
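For readers unfamiliar with the Spearman rank correlation mentioned in the abstract, the sketch below shows how ρ can be computed: both variables are converted to ranks (tied values share their average rank) and the Pearson correlation of the ranks is taken. The function names and the category figures are hypothetical. This is an illustration of the statistic itself, not of the paper's analysis pipeline.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("rho is undefined for a constant sample")
    return cov / (spread_x * spread_y)


# Hypothetical figures: apps per category and an "ideal granularity" per category.
category_sizes = [120, 450, 80, 900, 300]
ideal_granularity = [10, 35, 12, 60, 20]
print(spearman_rho(category_sizes, ideal_granularity))
```

A value of ρ close to 1, as reported for both stores, indicates that larger categories almost always have a higher ideal granularity, whatever the exact shape of the relationship.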

Details

This work has been accepted at ESEM 2016.
Download: full paper.
Data: The data used in this study is available here.

Mobile App and App Store Analysis, Testing and Optimisation (Keynote)

by A. A. Al-Subaihin, M. Harman, Y. Jia, W. Martin, F. Sarro, Y. Zhang

Abstract. This talk presents results on analysis and testing of mobile apps and app stores, reviewing the work of the UCL App Analysis Group (UCLappA) on App Store Mining and Analysis. The talk also covers the work of the UCL CREST centre on Genetic Improvement, applicable to app improvement and optimisation.

Details

Keynote paper at MobileSoft2016.
Download: full paper.

App Store Mining and Analysis (Keynote)

by A. A. Al-Subaihin, A. Finkelstein, M. Harman, Y. Jia, W. Martin, F. Sarro, Y. Zhang

Abstract. App stores are not merely disrupting traditional software deployment practice, but also offer considerable potential benefit to scientific research. Software engineering researchers have never had available a richer, wider and more varied source of information about software products. There is some source code availability, supporting scientific investigation as it does with more traditional open source systems. However, what is important and different about app stores is the other data available. Researchers can access user perceptions, expressed in rating and review data. Information is also available on app popularity (typically expressed as the number or rank of downloads). For more traditional applications, this data would simply be too commercially sensitive for public release. Pricing information is also partially available, though at the time of writing, this is sadly submerging beneath a more opaque layer of in-app purchasing. This talk will review research trends in the nascent field of App Store Analysis, presenting results from the UCL App Analysis Group (UCLappA) and others, and will give some directions for future work.

Details

Keynote paper at DeMobile2015.
Download: full paper.

Feature Lifecycles as They Spread, Migrate, Remain, and Die in App Stores

by F. Sarro, A. A. Al-Subaihin, M. Harman, Y. Jia, W. Martin, Y. Zhang

Abstract. We introduce a theoretical characterisation of feature lifecycles in app stores, to help app developers to identify trends and to find undiscovered requirements. To illustrate and motivate app feature lifecycle analysis, we used our theory to empirically analyse the migratory and non-migratory behaviours of 4,192 features from two App Stores (Samsung and Blackberry). The results reveal that, in both stores, intransitive features (those that neither migrate nor die out) exhibit statistically significantly different behaviours with regard to important properties, such as their price. Further correlation analysis also highlights differences between behaviours relating price, rating and popularity. Our results indicate that feature lifecycle analysis can yield insights that may also help developers to understand feature requirement behaviours and attribute relationships.

Details

This work has been accepted at RE 2015.
Download: full paper, report containing all the results obtained in our analysis.
Data: The data used in this study can be requested here.

The App Sampling Problem for App Store Mining

by W. Martin, M. Harman, F. Sarro, Y. Jia, Y. Zhang

Abstract. Many papers on App Store Mining are susceptible to the App Sampling Problem, which exists when only a subset of apps are studied, resulting in potential sampling bias. We introduce the App Sampling Problem, and study its effects on sets of user review data. We investigate the effects of sampling bias, and techniques for its amelioration in App Store Mining and Analysis, where sampling bias is often unavoidable. We mine 106,891 requests from 2,729,103 user reviews and investigate the properties of apps and reviews from 3 different partitions: the sets with fully complete review data, partially complete review data, and no review data at all. We find that app metrics such as price, rating, and download rank are significantly different between the three completeness levels. We show that correlation analysis can find trends in the data that prevail across the partitions, offering one possible approach to App Store Analysis in the presence of sampling bias.

Details

This work has been accepted at MSR 2015.
Download: full paper.
Data: The data used in this study can be requested here.

App Store Mining and Analysis (MSR2012)

by M. Harman, Y. Jia, Y. Zhang

Abstract. This paper introduces app store mining and analysis as a form of software repository mining. Unlike other software repositories traditionally used in MSR work, app stores usually do not provide source code. However, they do provide a wealth of other information in the form of pricing and customer reviews. Therefore, we use data mining to extract feature information, which we then combine with more readily available information to analyse apps’ technical, customer and business aspects. We applied our approach to the 32,108 non-zero priced apps available in the Blackberry app store in September 2011. Our results show that there is a strong correlation between customer rating and the rank of app downloads, though perhaps surprisingly, there is no correlation between price and downloads, nor between price and rating. More importantly, we show that these correlation findings carry over to (and are even occasionally enhanced within) the space of data mined app features, providing evidence that our ‘App store MSR’ approach can be valuable to app developers.

Details

This work has been presented at MSR2012.
Download: full paper
Data: we provide figures from our analysis. The data used in this study can be requested here.

App Store Analysis: Relationships between Customer, Business and Technical Characteristics

by A. Finkelstein, M. Harman, Y. Jia, W. Martin, F. Sarro, Y. Zhang

Abstract. This paper argues that App Store Analysis can be used to understand the rich interplay between app customers and their developers. We use data mining to extract price and popularity information and natural language processing and data mining to elicit each app’s claimed features from the Blackberry App Store, revealing strong correlations between customer rating and popularity (rank of app downloads). We found evidence for a mild correlation between the number of features claimed for an app and its price, but we found little evidence for any other correlations in which price was a participant. We also found that free apps have significantly (p-value < 0.001) higher rating than non-free apps, with a moderately high effect size (Â12 = 0.68). We also provide initial evidence that extracted claimed features are meaningful to developers (precision = 0.71, recall = 0.77). All data from our experiments and analysis are made available on-line to support further analysis.

Details

This work is currently under review. You can download the technical report here.
Data: we provide a report containing all the correlation graphs for each category used in our analysis. If the paper is accepted we will add the data from the paper to support replication.

Mining App Stores: Extracting Technical, Business and Customer Rating Information for Analysis and Prediction

by A. Finkelstein, M. Harman, Y. Jia, F. Sarro, Y. Zhang

Abstract. App development is an increasingly innovative and lucrative software industry. However, determining a suitable market price for an app is both demanding and critical; the comparatively low unit price, combined with a considerable volume of sales, dramatically increases the impact of mispricing. In this paper we leverage app store repository mining and machine learning to automatically construct predictive models for this pricing problem. We implement and evaluate our approach on 9,588 non-free apps from the Blackberry App Store, demonstrating that our approach statistically significantly outperforms existing approaches with at least medium effect size in 15 of the 17 (88%) Blackberry App Store categories.

Details

This paper is currently in progress. More details can be found in our technical report.
Download: technical report
Data: we provide figures from our analysis. If the paper is accepted we will add the data from the paper to support replication.

UCLappA

UCL App Store Analysis Group
