- Analyzed language: JavaScript
- Difficulty level: 200
jQuery is an extremely popular, but old, open source JavaScript library designed to simplify tasks such as HTML document traversal and manipulation, event handling, animation, and Ajax. The jQuery library supports modular plugins that extend its capabilities. Bootstrap is another popular JavaScript library, which has used jQuery's plugin mechanism extensively. However, the jQuery plugins inside Bootstrap used to be implemented in an unsafe way that could make users of Bootstrap vulnerable to cross-site scripting (XSS) attacks. In an XSS attack, an attacker uses a web application to send malicious code, generally in the form of a browser-side script, to a different end user.
Four such vulnerabilities in Bootstrap jQuery plugins were fixed in this pull request, and each was assigned a CVE.
The core mistake in these plugins was the use of the omnipotent jQuery `$` function to process the options that were passed to the plugin. For example, consider the following snippet from a simple jQuery plugin:
let text = $(options.textSrcSelector).text();
This plugin decides which HTML element to read text from by evaluating `options.textSrcSelector` as a CSS selector, or that is the intention at least. The problem in this example is that `$(options.textSrcSelector)` will execute JavaScript code instead if the value of `options.textSrcSelector` is a string like `"<img src=x onerror=alert(1)>"`. The values in `options` cannot always be trusted.
In security terminology, jQuery plugin options are a source of user input, and the argument of `$` is an XSS sink.
The pull request linked above shows one approach to making such plugins safer: use a more specialized, safer function like `$(document).find` instead of `$`.
let text = $(document).find(options.textSrcSelector).text();
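The dual behavior of `$` described above can be illustrated with a small sketch. The helper `interpretDollarArgument` is hypothetical and only approximates the idea that a string beginning with `<` is treated as HTML while other strings are treated as selectors; it is not jQuery's actual implementation, and the exact rule differs between jQuery versions.

```javascript
// Simplified approximation of how `$` treats its argument.
// Assumption: this check is illustrative only and is NOT jQuery's real code.
function interpretDollarArgument(arg) {
  if (typeof arg !== "string") {
    return "other";
  }
  // A string that begins with "<" is parsed as HTML; anything else is
  // handed to the selector engine.
  return /^\s*</.test(arg) ? "html" : "selector";
}

const samples = ["#title", ".card > p", "<img src=x onerror=alert(1)>"];
for (const sample of samples) {
  console.log(`${interpretDollarArgument(sample)}: ${sample}`);
}
```

A function such as `$(document).find` avoids this ambiguity because it only ever interprets its argument as a selector.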
In this challenge, we will use CodeQL to analyze the source code of Bootstrap, taken from before these vulnerabilities were patched, and identify the vulnerabilities.
To take part in the workshop you will need to follow these steps to get the CodeQL development environment set up:
- Install the Visual Studio Code IDE.
- Download and install the CodeQL extension for Visual Studio Code. Full setup instructions are here.
- Set up the starter workspace.
  - Important: Don't forget to `git clone --recursive` or `git submodule update --init --remote`, so that you obtain the standard query libraries.
- Open the starter workspace: File > Open Workspace > Browse to `vscode-codeql-starter/vscode-codeql-starter.code-workspace`.
- Download the esbena_bootstrap-pre-27047_javascript CodeQL database.
- Unzip the database.
- Import the unzipped database into Visual Studio Code:
  - Click the CodeQL icon in the left sidebar.
  - Place your mouse over Databases, and click the + sign that appears on the right.
  - Choose the unzipped database directory on your filesystem.
- Create a new file, name it `UnsafeDollarCall.ql`, and save it under `codeql-custom-queries-javascript`.
If you get stuck, try searching our documentation and blog posts for help and ideas. Below are a few links to help you get started:
The workshop is split into several steps. You can write one query per step, or work with a single query that you refine at each step.
Each step has a Hint that describes useful classes and predicates in the CodeQL standard libraries for JavaScript, as well as keywords in CodeQL. You can explore these in your IDE using the autocomplete suggestions (`Ctrl+Space`) and the jump-to-definition command (`F12`).
Each step has a Solution that indicates one possible answer. Note that all queries will need to begin with `import javascript`, but for simplicity this may be omitted below.
- Find all function call expressions, such as `alert("hello world")` and `speaker.sayHello("world")`.

  Hint

  A function call is called a `CallExpr` in the CodeQL JavaScript library.

  Solution

      from CallExpr dollarCall
      select dollarCall

- Identify the expression that is used as the first argument for each call, such as `alert(<first-argument>)` and `speaker.sayHello(<first-argument>)`.

  Hint

  - Add another variable to your `from` clause. This can be named `dollarArg` and have type `Expr`.
  - Add a `where` clause.
  - `CallExpr` has a predicate `getArgument(int)` to find the argument at a 0-based index.

  Solution

      from CallExpr dollarCall, Expr dollarArg
      where dollarArg = dollarCall.getArgument(0)
      select dollarArg

- Filter your results to only those calls to a function named `$`, such as `$("hello world")` and `speaker.$("world")`.

  Hint

  - `CallExpr` has a predicate `getCalleeName()` to find the name of the function being called.
  - Use the `and` keyword to add conditions to your query.
  - Use the `=` operator to assert that two values are equal.

  Solution

      from CallExpr dollarCall, Expr dollarArg
      where
        dollarArg = dollarCall.getArgument(0) and
        dollarCall.getCalleeName() = "$"
      select dollarArg

- So far we have looked for the function name `$`. Are there other ways of calling the jQuery `$` function? Perhaps the CodeQL library can handle these for us?

  The CodeQL standard library for JavaScript has a built-in predicate `jquery()` to describe references to `$`. Expand the hint for details, and modify your query to use it.

  Hint

  - Calling the predicate `jquery()` returns all values that refer to the `$` function.
  - To find all calls to this function, use the predicate `getACall()`.
  - Notice that when you call `jquery()`, `getACall()`, and `getAnArgument()` in succession, you get return values of type `DataFlow::Node`, not `Expr`. These are data flow nodes. They describe a part of the source program that may have a value, and let us do more complex reasoning about this value. We'll learn more about these in the next section.
  - You can change your `dollarArg` variable to have type `DataFlow::Node`, or convert the data flow node back into an `Expr` using the predicate `asExpr()`.

  Solution

      from Expr dollarArg
      where dollarArg = jquery().getACall().getArgument(0).asExpr()
      select dollarArg

  OR

      from DataFlow::Node dollarArg
      where dollarArg = jquery().getACall().getArgument(0)
      select dollarArg

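Matching on the callee name alone is only a rough proxy for "a call to jQuery". The sketch below shows the kinds of variation that a name check cannot distinguish. The `jQuery` function here is a stub written for this example so that the snippet runs without the real library, and `jq` is a hypothetical alias; which of these calls the library predicate recognizes is best confirmed by running your query on similar code.

```javascript
// Stand-in for the jQuery function (assumption: a stub, not the real library).
// It records every argument it receives.
const received = [];
function jQuery(arg) {
  received.push(arg);
  return { arg };
}

const $ = jQuery;                 // the usual short alias
const speaker = { $: (x) => x };  // unrelated method that happens to be named "$"

$("#a");             // callee name is "$": found by the name-based query
speaker.$("world");  // callee name is "$" too, but this is not jQuery
jQuery("#b");        // same function under another name: missed by the name check
const jq = $;        // hypothetical local alias
jq("#c");            // same function again: also missed by the name check

console.log(received);
```

Only three of the four calls reach the stub, and only one of those three has the callee name `$`, which is why reasoning about the value being called is more reliable than reasoning about its name.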
jQuery plugins are usually defined by assigning a value to a property of the `$.fn` object:
$.fn.copyText = function() { ... } // this function is a jQuery plugin
In this step, we will find such plugins, and their options.
Consider creating a new query for these next few steps, or commenting out your earlier solutions and using the same file. We will use the earlier solutions again in the next section.
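Before writing the queries, it may help to see what a plugin definition and its options look like at run time. The following is a minimal sketch: `Collection` and the stub `$` are stand-ins written for this example (not jQuery's implementation), and the body of `copyText` is invented for illustration. It relies on the fact that in jQuery `$.fn` refers to the prototype shared by jQuery objects, so a function stored there becomes a method on every result of `$(...)`.

```javascript
// Minimal stand-in for jQuery (assumption: a stub so the snippet runs alone).
function Collection(selector) {
  this.selector = selector;
}
function $(selector) {
  return new Collection(selector);
}
// `$.fn` is the prototype shared by all objects that `$` returns.
$.fn = Collection.prototype;

// A plugin: a function stored in a property of `$.fn`.
// Its last parameter carries the caller-supplied options.
$.fn.copyText = function (options) {
  return `copy text from ${options.textSrcSelector} into ${this.selector}`;
};

const message = $("#target").copyText({ textSrcSelector: "#source" });
console.log(message);
```

The two things the queries in this section look for are both visible here: the write to a property of `$.fn`, and the last parameter of the function being stored.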
- You have already seen how to find references to the jQuery `$` function. Now find all places in the code that read the property `$.fn`.

  Hint

  - Declare a new variable of type `DataFlow::Node` to hold the results.
  - Notice that `jquery()` returns a value of type `DataFlow::SourceNode`. Source nodes are places in the program that introduce a new value, from which the flow of data may be tracked.
  - `DataFlow::SourceNode` has a predicate named `getAPropertyRead(string)`, which finds all reads of a particular property on the same object. The string argument is the name of the property.

  Solution

      from DataFlow::Node n
      where n = jquery().getAPropertyRead("fn")
      select n

- Find the functions that are assigned to a property of `$.fn`. These are jQuery plugins.

  Remember the previous example:

      $.fn.copyText = function() { ... } // this function is a jQuery plugin

  There might be some variation in how this code is written. For example, we might see intermediate assignments to local variables:

      let fn = $.fn
      let f = function() { ... } // this function is a jQuery plugin
      fn.copyText = f

  Intermediate variables and nested expressions like these are typical of real source code, and detecting them requires local data flow analysis.

  Data flow analysis helps us answer questions like: does this expression ever hold a value that originates from a particular other place in the program?

  We have already encountered data flow nodes, described by the `DataFlow::Node` CodeQL class. They are places in the program that have a value. They are returned by useful predicates like `jquery()` in the library.

  These nodes are separate and distinct from the AST (Abstract Syntax Tree, which represents the basic structure of the program) nodes, to allow for flexibility in how data flow is modeled.

  We can visualize the data flow analysis problem as one of finding paths through a directed graph, where the nodes of the graph are data flow nodes, and the edges represent the flow of data between those nodes. If a path exists, then the data flows between those two nodes.

  The CodeQL JavaScript data flow library is very expressive. It has several classes that describe different places in the program that can have a value. We have seen `SourceNode`s; there are many other forms such as `ValueNode`s, `FunctionNode`s, `ParameterNode`s, and `CallNode`s. You can find out more in the documentation.

  When we are looking for the flow of information to or from these nodes within a single function or scope, this is called local data flow analysis. The CodeQL library has several predicates available on different types of data flow node that reason about local data flow.

  You have already seen one such predicate: `SourceNode.getAPropertyRead()`. To complete this step of the workshop, look at the hint for another useful predicate.

  Hint

  - `DataFlow::SourceNode` has a predicate named `getAPropertySource()`, which finds a source node whose value is stored in a property of this node.
  - In the previous step, we used `getAPropertyRead(string)` to identify the source node `$.fn`. Now try to find a value stored in a property of this source node `$.fn`.

  Solution

      from DataFlow::Node plugin
      where plugin = jquery().getAPropertyRead("fn").getAPropertySource()
      select plugin

- Find the last parameter of the jQuery plugin functions that you identified in the previous step. These parameters are the plugin options.

  Hint

  - Modify your `from` clause so that the variable that describes the jQuery plugin is of type `DataFlow::FunctionNode`. As the name suggests, this is a data flow node that refers to a function definition.
  - `DataFlow::FunctionNode` has a predicate named `getLastParameter()`.
  - If you want to add a new variable to describe the parameter, it can be of type `DataFlow::ParameterNode`.

  Solution

      from DataFlow::FunctionNode plugin, DataFlow::ParameterNode optionsParam
      where
        plugin = jquery().getAPropertyRead("fn").getAPropertySource() and
        optionsParam = plugin.getLastParameter()
      select plugin, optionsParam

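The view of data flow as path finding in a directed graph, introduced in this section, can be sketched in a few lines of JavaScript. The graph below is a hand-written, hypothetical model of the three-line example with the intermediate variables `fn` and `f`; the node labels are invented for this sketch and do not correspond to how CodeQL represents nodes internally.

```javascript
// Hypothetical data flow graph for:
//   let fn = $.fn
//   let f = function() { ... }
//   fn.copyText = f
// Each entry maps a node to the nodes its value flows into.
const edges = new Map([
  ["$.fn", ["fn"]],
  ["fn", ["base of fn.copyText"]],
  ["function literal", ["f"]],
  ["f", ["value stored in fn.copyText"]],
]);

// Breadth-first search: is there a path from `source` to `target`?
function flowsTo(source, target) {
  const queue = [source];
  const seen = new Set(queue);
  while (queue.length > 0) {
    const node = queue.shift();
    if (node === target) {
      return true;
    }
    for (const next of edges.get(node) ?? []) {
      if (!seen.has(next)) {
        seen.add(next);
        queue.push(next);
      }
    }
  }
  return false;
}

// The function literal reaches the stored value through the variable `f`.
console.log(flowsTo("function literal", "value stored in fn.copyText"));
// `$.fn` reaches the base of the property write, not the stored value.
console.log(flowsTo("$.fn", "value stored in fn.copyText"));
```

Predicates such as `getAPropertyRead` and `getAPropertySource` spare you from writing this kind of search by hand: they answer the reachability question for the local flow steps that the library models.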
We have now identified (a) places in the program which receive jQuery plugin options (which may be untrusted data) and (b) places in the program which are passed to the jQuery `$` function and may be interpreted as HTML. We now want to tie these two together to ask: does the untrusted data from a jQuery plugin option ever flow to the potentially unsafe `$` call?
This is also a data flow problem. However, it is larger in scope than the problems we have tackled so far, because the plugin options and the `$` call may be in different functions. We call this a global data flow problem.
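To make the "different functions" case concrete, here is a small sketch of the shape of code that a global analysis needs to follow. The stub `$` only records what it is given, and the helper `readText` is hypothetical; neither is taken from Bootstrap.

```javascript
// Stub for `$` (assumption: stands in for jQuery and only records its input).
const sinkArguments = [];
function $(arg) {
  sinkArguments.push(arg);
  return { text: () => "" };
}
$.fn = {};

// The sink (the first argument of `$`) lives in a helper function...
function readText(selector) {
  return $(selector).text();
}

// ...while the source (the options parameter) lives in the plugin function.
$.fn.copyText = function (options) {
  return readText(options.textSrcSelector);
};

// An attacker-influenced option travels across the function boundary.
$.fn.copyText({ textSrcSelector: "<img src=x onerror=alert(1)>" });
console.log(sinkArguments);
```

A local analysis of `readText` alone sees only an ordinary parameter, and a local analysis of the plugin alone sees no call to `$`. The path from source to sink only appears when the call from one function to the other is taken into account.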
In this section we will create a path problem query capable of looking for global data flow, by populating this template:
/**
 * @name Cross-site scripting vulnerable plugin
 * @kind path-problem
 * @id js/xss-unsafe-plugin
 */

import javascript
import DataFlow::PathGraph

class Config extends TaintTracking::Configuration {
  Config() { this = "Config" }

  override predicate isSource(DataFlow::Node source) {
    exists(/** TODO fill me in from Section 2 **/ |
      source = /** TODO fill me in from Section 2 **/
    )
  }

  override predicate isSink(DataFlow::Node sink) {
    sink = /** TODO fill me in from Section 1 **/
  }
}

from Config config, DataFlow::PathNode source, DataFlow::PathNode sink
where config.hasFlowPath(source, sink)
select sink, source, sink, "Potential XSS vulnerability in plugin."
- Complete the `isSource` predicate using the query you wrote for Section 2.

  Hint

  - You can translate from a query clause to a predicate by:
    - Converting the variable declarations in the `from` part to the variable declarations of an `exists`
    - Placing the `where` clause conditions (if any) in the body of the `exists`
    - Adding a condition which equates the `select` to one of the parameters of the predicate.
  - Remember that the source of untrusted data is the jQuery plugin options parameter.

  Solution

      override predicate isSource(DataFlow::Node source) {
        exists(DataFlow::FunctionNode plugin |
          plugin = jquery().getAPropertyRead("fn").getAPropertySource() and
          source = plugin.getLastParameter()
        )
      }

- Complete the `isSink` predicate by using the query you wrote for Section 1.

  Hint

  - Complete the same process as above.
  - We already found a `DataFlow::Node` in Section 1 as the result of calling `jquery()` and predicates on it.
  - Remember that the first argument of a call to `$` is a sink for XSS vulnerabilities.

  Solution

      override predicate isSink(DataFlow::Node sink) {
        sink = jquery().getACall().getArgument(0)
      }

- You can now run the completed query. This should find 5 results on the unpatched Bootstrap codebase.

  Completed query

      /**
       * @name Cross-site scripting vulnerable plugin
       * @kind path-problem
       * @id js/xss-unsafe-plugin
       */

      import javascript
      import DataFlow::PathGraph

      class Configuration extends TaintTracking::Configuration {
        Configuration() { this = "XssUnsafeJQueryPlugin" }

        override predicate isSource(DataFlow::Node source) {
          exists(DataFlow::FunctionNode plugin |
            plugin = jquery().getAPropertyRead("fn").getAPropertySource() and
            source = plugin.getLastParameter()
          )
        }

        override predicate isSink(DataFlow::Node sink) {
          sink = jquery().getACall().getArgument(0)
        }
      }

      from Configuration cfg, DataFlow::PathNode source, DataFlow::PathNode sink
      where cfg.hasFlowPath(source, sink)
      select sink, source, sink, "Potential XSS vulnerability in plugin."

We have created a query from scratch to find this problem. A production version of this query can be found as part of the default set of CodeQL security queries: UnsafeJQueryPlugin.ql. You can see the results on a vulnerable copy of Bootstrap that has been analyzed on LGTM.com, our free open source analysis platform.
- Read the tutorial on analyzing data flow in JavaScript and TypeScript.
- Try out the latest CodeQL Capture-the-Flag challenge on the GitHub Security Lab website for a chance to win a prize! Or try one of the older Capture-the-Flag challenges to improve your CodeQL skills.
- Try out a CodeQL course on GitHub Learning Lab.
- Read about more vulnerabilities found using CodeQL on the GitHub Security Lab research blog.
- Explore the open-source CodeQL queries and libraries, and learn how to contribute a new query.
This is a reduced version of a Capture-the-Flag challenge devised by @esbena, available at https://securitylab.github.com/ctf/jquery. Try out the full version! Thanks to our moderators for valuable feedback on the workshop.