Monday, October 23, 2023

Semgrep Rules for Android Application Security


The number of Android applications has been growing rapidly in recent years. In 2022, there were over 3.55 million Android apps available in the Google Play Store, and this number is expected to continue to grow in the years to come. The expansion of the Android app market is being driven by a number of factors, including the increasing popularity of smartphones, the growing demand for mobile apps, and the ease of developing and publishing Android apps. At the same time, the number of Android app
downloads is also growing rapidly. In 2022, there were over 255 billion Android app downloads worldwide.

For this reason, introducing automatic security controls during Mobile Application Penetration Testing (MAPT) activity and the CI phase is necessary to ensure the security of Android apps by scanning for vulnerabilities before merging into the main repository.

Decompiling Android Packages

The compilation of Android applications is a multi-step process that involves different bytecodes, compilers, and execution engines. Generally speaking, a common compilation flow is divided into three phases:

  1. Precompilation: The Java source code(".java") is converted into Java bytecode(".class").
  2. Postcompilation: The Java bytecode is converted into Dalvik bytecode(".dex").
  3. Release: The ".dex" and resource files are packed, signed and compressed into the Android App Package (APK)

Finally, the Dalvik bytecode is executed by the Android runtime (ART) environment.

Generally, the target of a Mobile Application Penetration Testing (MAPT) activity is in the form of an APK file. The decompilation of the both aforementioned bytecodes is possible and can be performed through the use of tools such as Jadx.

jadx -d ./out-dir target.apk


The OWASP MAS project is a valuable resource for mobile security professionals, providing a comprehensive set of resources to enhance the security of mobile apps. The project includes several key components:
  • OWASP MASVS: This resource outlines requirements for software architects and developers who aim to create secure mobile applications. It establishes an industry standard that can be used as a benchmark in mobile app security assessments. Additionally, it clarifies the role of software protection mechanisms in mobile security and offers requirements to verify their effectiveness.
  • OWASP MASTG: This comprehensive manual covers the processes, techniques, and tools used during mobile application security analysis. It also includes an exhaustive set of test cases for verifying the requirements outlined in the OWASP Mobile Application Security Verification Standard (MASVS). This serves as a foundational basis for conducting thorough and consistent security tests.
  • OWASP MAS Checklist: This checklist aligns with the tests described in the MASTG and provides an output template for mobile security testing.


Semgrep is a Static Application Security Testing (SAST) tool that performs intra-file analysis, allowing you to define code patterns for detecting misconfigurations or security issues by analyzing one file at a time in isolation. Some advantages of using Semgrep include:

  • It does not require that the source code is uploaded to an external cloud.
  • It does not require that the target source code is buildable and have all dependencies. It can work only with a single source file.
  • It is exceptionally fast.
  • It allows you to write your custom patterns very easily.
Once Semgrep is integrated into your CI pipeline, it automatically scans your code for potential vulnerabilities every time you commit changes. This helps identify and address vulnerabilities early in the development process, improving your software's security.

Key Insights on Semgrep

First of all, install Semgrep with the following command:
python3 -m pip install semgrep
Semgrep accepts two fundamental input:
  • Rules collection: A collection is composed by ".yaml" files, alternatively referred to as "rules". A rule includes a series of patterns designed to identify or exclude specific elements within the target source code.
  • Target source code: This denotes the source code subject to analysis. It may also encompass partial code or code with certain dependencies omitted.
The four main elements you can find inside a Semgrep rule yaml file are:
...Match a sequence of zero or more items such as arguments, statements, parameters, fields, characters.
"..."Match any single hardcoded string.
$AMatch variables, functions, arguments, classes, object methods, imports, exceptions, and more.
<... e ...>Match an expression ("e") that could be deeply nested within another expression.

Moreover, Semgrep provides several experimental modes that could be really useful in more difficult situations:
  • taint: It enables the data-flow analysis feature allowing to specify sources and sinks.
  • join: It allows to use multiple rules on more than one file and to join the results.
  • extract: It allows work with source file that contains different programming languages.
Suppose to have a rules collection in the directory "myrules/" and a target source code "mytarget/". To launch a Semgrep scan is very simple:
semgrep -c ./myrules ./mytarget