Examples

Real-world usage examples for security research and code analysis.

!!! note "Command Usage"
    If installed via pip install scanipy-cli, use the scanipy command. If running from source, use python scanipy.py instead.

Security Research

Command Injection

Find potential command injection vulnerabilities:

scanipy --query "os.system" --language python \
  --keywords "user,input,request" --run-semgrep
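For orientation, the kind of code this query tends to surface looks like the sketch below. It is a minimal illustration rather than scanipy output; the function names and the cat command are hypothetical. Nothing is executed: the functions only build the command, so the difference between the variants is visible.

```python
import shlex

def unsafe_shell_command(filename: str) -> str:
    # Pattern the search looks for: user input concatenated into a string
    # that would be handed to os.system(), where a shell interprets it.
    return "cat " + filename

def quoted_shell_command(filename: str) -> str:
    # If a shell string is unavoidable, shlex.quote() escapes the argument.
    return "cat " + shlex.quote(filename)

def safe_argv(filename: str) -> list:
    # Preferred: an argument list for subprocess.run(argv) with shell=False,
    # so no shell ever parses the input.
    return ["cat", "--", filename]

user_input = "notes.txt; id"  # hypothetical attacker-controlled value
print(unsafe_shell_command(user_input))  # cat notes.txt; id
print(quoted_shell_command(user_input))  # cat 'notes.txt; id'
print(safe_argv(user_input))             # ['cat', '--', 'notes.txt; id']
```

When triaging hits, the first shape is the one worth a closer look; the other two usually indicate the input is already handled.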

SQL Injection

Find potential SQL injection:

scanipy --query "execute(" --language python \
  --keywords "format,user,%s" --run-semgrep
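The keywords format and %s target queries assembled with string formatting. A small sketch of why that matters, using the standard-library sqlite3 module with an in-memory database; the table, function names, and payload are made up for illustration.

```python
import sqlite3

def make_demo_db() -> sqlite3.Connection:
    # Throwaway in-memory database with two demo rows.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "admin"), ("bob", "user")],
    )
    return conn

def find_user_unsafe(conn, name):
    # Vulnerable: the value is formatted into the SQL text itself.
    query = "SELECT name FROM users WHERE name = '%s'" % name
    return conn.execute(query).fetchall()

def find_user_safe(conn, name):
    # Parameterized: the driver passes the value separately from the SQL.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()

conn = make_demo_db()
payload = "nobody' OR '1'='1"  # hypothetical attacker input
print(len(find_user_unsafe(conn, payload)))  # 2 (every row matches)
print(len(find_user_safe(conn, payload)))    # 0 (treated as a literal name)
```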

Unsafe Deserialization

Find unsafe pickle usage:

scanipy --query "pickle.loads" --language python --run-semgrep
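As background on why pickle.loads is worth searching for: unpickling calls whatever callable the byte stream names. The sketch below shows this with a deliberately harmless callable (sorted); the Payload class is hypothetical, and a real attack would substitute something like a process-spawning function.

```python
import pickle

class Payload:
    def __reduce__(self):
        # Tells pickle: "to rebuild me, call sorted([3, 1, 2])".
        # The callable and its arguments are chosen by whoever made the bytes.
        return (sorted, ([3, 1, 2],))

blob = pickle.dumps(Payload())

# The loader never sees the Payload class; it just runs the named callable.
result = pickle.loads(blob)
print(result)  # [1, 2, 3]
```

A hit is only a finding when the bytes passed to pickle.loads can come from an untrusted source, which is what the Semgrep pass helps narrow down.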

Path Traversal (Tarslip)

Find path traversal vulnerabilities in archive extraction:

scanipy --query "extractall" --language python \
  --run-semgrep --rules ./tools/semgrep/rules/tarslip.yaml

With CodeQL for deeper analysis:

scanipy --query "extractall" --language python --run-codeql \
  --codeql-queries "codeql/python-queries:Security/CWE-022/TarSlip.ql"
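To make the tarslip pattern concrete, here is a minimal sketch of the check that vulnerable extractall callers are missing. The helper names and the /tmp/extract destination are hypothetical, the archive is built in memory, and nothing is written to disk. The check covers path components only; it does not handle symlink or hardlink members.

```python
import io
import os
import tarfile

def escapes_destination(dest: str, member_name: str) -> bool:
    # Resolve where the member would land and compare with the destination.
    dest_abs = os.path.abspath(dest)
    target = os.path.abspath(os.path.join(dest_abs, member_name))
    return os.path.commonpath([dest_abs, target]) != dest_abs

def make_archive(names) -> io.BytesIO:
    # Build a small in-memory tar whose members carry the given names.
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as tar:
        for name in names:
            data = b"demo"
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buffer.seek(0)
    return buffer

archive = make_archive(["ok/readme.txt", "../evil.txt"])
with tarfile.open(fileobj=archive, mode="r") as tar:
    flagged = [
        member.name
        for member in tar.getmembers()
        if escapes_destination("/tmp/extract", member.name)
    ]
print(flagged)  # ['../evil.txt']
```

On recent Python versions, passing filter="data" to extractall applies a comparable check inside the standard library, so its presence in a hit is a reasonable signal that the call is already hardened.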

Hardcoded Secrets

Find potential hardcoded credentials:

scanipy --query "password =" --language python \
  --keywords "secret,api_key,token" --run-semgrep
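A plain text search for "password =" also matches assignments that read from the environment or a config object. One way to triage hits locally is a small filter like the sketch below; the regular expression and function name are illustrative assumptions, not part of scanipy or its Semgrep rules.

```python
import re

# Matches: password = "literal" or password = 'literal' (non-empty literal).
HARDCODED = re.compile(r"""\bpassword\s*=\s*(['"])(?P<value>.+?)\1""")

def hardcoded_password(line: str):
    # Return the literal value if the line assigns one, else None.
    match = HARDCODED.search(line)
    return match.group("value") if match else None

print(hardcoded_password('password = "hunter2"'))                  # hunter2
print(hardcoded_password('password = os.environ["DB_PASSWORD"]'))  # None
print(hardcoded_password('password = ""'))                         # None
```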

Code Pattern Analysis

Deprecated API Usage

Find deprecated urllib2 usage:

scanipy --query "urllib2" --language python
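For context on what a hit implies: urllib2 exists only in Python 2, and its functionality moved to urllib.request and urllib.parse in Python 3. A short sketch of the Python 3 equivalents follows; the URL is a placeholder and no network request is made.

```python
from urllib.parse import quote
from urllib.request import Request

# Python 2 spelling: urllib2.Request(url, data) and urllib.quote(text)
request = Request("https://example.com/api", data=b"q=1")

print(request.full_url)      # https://example.com/api
print(request.get_method())  # POST (a request with a body defaults to POST)
print(quote("a b/c"))        # a%20b/c
```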

Library Usage

Find specific library usage in popular repos:

scanipy --query "import tensorflow" --language python \
  --search-strategy tiered

Advanced Filtering

Exclude Organizations

Search but exclude specific organizations:

scanipy --query "eval(" \
  --additional-params "stars:>1000 -org:microsoft -org:google"

High-Star Repos Only

Focus on very popular repositories:

scanipy --query "subprocess.Popen" --language python \
  --additional-params "stars:>10000"

Combined Filters

scanipy \
  --query "subprocess" \
  --language python \
  --keywords "shell=True,user" \
  --pages 10 \
  --search-strategy tiered \
  --run-semgrep
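When the same combination of filters is reused across many queries, it can help to build the command line programmatically. The helper below is a hypothetical wrapper, not part of scanipy; it only assembles flags that already appear in the examples on this page and does not run anything.

```python
import shlex

def build_scanipy_argv(query, language=None, keywords=None, pages=None,
                       strategy=None, run_semgrep=False) -> list:
    # Assemble an argument list suitable for subprocess.run(argv).
    argv = ["scanipy", "--query", query]
    if language:
        argv += ["--language", language]
    if keywords:
        argv += ["--keywords", ",".join(keywords)]
    if pages is not None:
        argv += ["--pages", str(pages)]
    if strategy:
        argv += ["--search-strategy", strategy]
    if run_semgrep:
        argv.append("--run-semgrep")
    return argv

argv = build_scanipy_argv(
    "subprocess",
    language="python",
    keywords=["shell=True", "user"],
    pages=10,
    strategy="tiered",
    run_semgrep=True,
)
# shlex.join() renders the list as a copy-pasteable shell command.
print(shlex.join(argv))
```

Passing the list to subprocess.run rather than a joined string avoids quoting problems with queries such as "eval(".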

Workflow Examples

Research Workflow

Search and save results:

scanipy --query "extractall" --language python \
  --output tarslip_repos.json

Review results, then run analysis:

scanipy --query "extractall" \
  --input-file tarslip_repos.json \
  --run-semgrep --rules ./tools/semgrep/rules/tarslip.yaml

Run CodeQL for deeper analysis:

scanipy --query "extractall" --language python \
  --input-file tarslip_repos.json \
  --run-codeql --codeql-output-dir ./tarslip_sarif
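Once the CodeQL step has written SARIF files, a quick per-rule count helps decide where to look first. The sketch below relies only on the standard SARIF 2.1.0 layout (runs, results, ruleId). It assumes the output directory contains files ending in .sarif, which this page does not state, and the sample data is invented for the demonstration.

```python
import json
from collections import Counter
from pathlib import Path

def count_results_by_rule(sarif: dict) -> Counter:
    # Tally results per ruleId across every run in one SARIF document.
    counts = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            counts[result.get("ruleId", "<no ruleId>")] += 1
    return counts

def summarize_directory(directory: str) -> Counter:
    # Assumption: one or more *.sarif files directly inside the directory.
    total = Counter()
    for path in sorted(Path(directory).glob("*.sarif")):
        with path.open(encoding="utf-8") as handle:
            total += count_results_by_rule(json.load(handle))
    return total

# Invented sample document, shaped like SARIF 2.1.0.
sample = {
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "CodeQL"}},
        "results": [
            {"ruleId": "py/tarslip", "message": {"text": "example finding"}},
            {"ruleId": "py/tarslip", "message": {"text": "example finding"}},
            {"ruleId": "py/path-injection", "message": {"text": "example finding"}},
        ],
    }],
}
print(count_results_by_rule(sample).most_common())
# [('py/tarslip', 2), ('py/path-injection', 1)]
```

For the workflow above, summarize_directory("./tarslip_sarif") would aggregate the counts across all analyzed repositories, provided the file-naming assumption holds.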

Long-Running Analysis

For large-scale analysis with resume capability:

# Start analysis (can be interrupted)
scanipy --query "eval(" --language python \
  --pages 10 \
  --run-semgrep \
  --results-db ./eval_analysis.db \
  --keep-cloned \
  --clone-dir ./eval_repos

# Resume if interrupted (same arguments, plus --resume)
scanipy --query "eval(" --language python \
  --pages 10 \
  --run-semgrep \
  --results-db ./eval_analysis.db \
  --keep-cloned \
  --clone-dir ./eval_repos \
  --resume

Multi-Tool Analysis

Run both Semgrep and CodeQL on the same repositories:

# First, search and run Semgrep
scanipy --query "extractall" --language python \
  --run-semgrep \
  --keep-cloned \
  --clone-dir ./repos \
  --output repos.json

# Then run CodeQL on the same repos
scanipy --query "extractall" --language python \
  --input-file repos.json \
  --run-codeql \
  --clone-dir ./repos \
  --codeql-output-dir ./sarif_results

Language-Specific Examples

JavaScript/TypeScript

scanipy --query "eval(" --language javascript --run-codeql

Java

scanipy --query "Runtime.exec" --language java --run-codeql

Go

scanipy --query "os/exec" --language go --run-codeql

C/C++

scanipy --query "strcpy" --language c --run-codeql \
  --codeql-queries "cpp-security-extended"

Resuming Interrupted Analysis

Both the Semgrep and CodeQL modes can resume an interrupted analysis, so a large scan does not have to start over after a failure.

Resume Semgrep Analysis

# Start analysis with database tracking
scanipy --query "SQL injection" --language python \
  --run-semgrep --results-db sql_injection.db

# If interrupted (Ctrl+C, network error, etc.)
# Resume from where you left off
scanipy --query "SQL injection" --language python \
  --run-semgrep --results-db sql_injection.db --resume

Resume CodeQL Analysis

# Start CodeQL analysis with database tracking
scanipy --query "path traversal" --language python \
  --run-codeql --codeql-results-db path_analysis.db

# Resume interrupted analysis
scanipy --query "path traversal" --language python \
  --run-codeql --codeql-results-db path_analysis.db --codeql-resume

Large-Scale Analysis Workflow

For analyzing hundreds of repositories:

# Day 1: Start large scan (100 repos)
scanipy --query "unsafe deserialization" --language java \
  --pages 10 --run-codeql --codeql-results-db deserialization_scan.db

# Analysis interrupted after 40 repositories...

# Day 2: Resume (skips first 40, continues with remaining)
scanipy --query "unsafe deserialization" --language java \
  --pages 10 --run-codeql --codeql-results-db deserialization_scan.db \
  --codeql-resume

# Completed: All 100 repositories analyzed

Key Points

Session Matching: Resume works by matching query, language, and analysis parameters
Automatic Skipping: Already-analyzed repositories are automatically skipped
Incremental Saves: Results are saved after each repository
Crash Recovery: Analysis survives Ctrl+C, network errors, or system crashes
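The points above can be pictured with a small model. This is a sketch of the general technique (a session key plus a per-repository commit), using an in-memory SQLite database; the table layout, function names, and repository names are assumptions and do not reflect scanipy's actual schema.

```python
import hashlib
import json
import sqlite3

class SimulatedInterrupt(Exception):
    """Stands in for Ctrl+C or a network error in this demo."""

def session_key(query, language, rules) -> str:
    # Session matching: identical parameters map to the same key.
    payload = json.dumps([query, language, rules])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def open_results_db(path=":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "session TEXT, repo TEXT, findings INTEGER, "
        "PRIMARY KEY (session, repo))"
    )
    return conn

def analyze_repos(conn, session, repos, analyze) -> list:
    # Automatic skipping: load repositories already stored for this session.
    rows = conn.execute(
        "SELECT repo FROM results WHERE session = ?", (session,)
    )
    done = {row[0] for row in rows}
    newly_analyzed = []
    for repo in repos:
        if repo in done:
            continue
        findings = analyze(repo)
        conn.execute(
            "INSERT INTO results VALUES (?, ?, ?)", (session, repo, findings)
        )
        conn.commit()  # incremental save after each repository
        newly_analyzed.append(repo)
    return newly_analyzed

def analyzer_failing_on(bad_repo):
    # Fake analyzer: reports zero findings, fails on one chosen repository.
    def analyze(repo):
        if repo == bad_repo:
            raise SimulatedInterrupt(repo)
        return 0
    return analyze

conn = open_results_db()
session = session_key("eval(", "python", "default-rules")
repos = ["org/a", "org/b", "org/c", "org/d"]

try:
    analyze_repos(conn, session, repos, analyzer_failing_on("org/c"))
except SimulatedInterrupt:
    pass  # first run stops after org/a and org/b were committed

resumed = analyze_repos(conn, session, repos, analyzer_failing_on(None))
print(resumed)  # ['org/c', 'org/d']
```

Because the key is derived from the parameters, changing the query or language in this model starts a fresh session instead of resuming the old one.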
