Cross-Site Scripting (XSS) front-end security details for Ruby on Rails developers
Cross-Site Scripting is a security hole that allows attackers to inject and execute JavaScript on your website.
The cause of the problems: Data changes context
XSS is a very specific problem, but it’s caused by a general issue that affects all computer systems and programming languages:
Applications process data using different programming languages and formats (for example Ruby, JavaScript, SQL; plain text, HTML, JSON, CSV). Data moves from one context into another context because languages and formats are nested or chained.
Data that has a specific meaning in one context gets a different meaning when put into another context. In one context, data is just plain text. In another context, it may be interpreted as code.
Untrusted content
Web applications deal with untrusted content all the time. This is data that isn’t created by the service provider, developers or trusted parties. It may contain errors, it may be incomplete, it may not comply with syntactical rules. In addition, it may contain malicious code.
Sources of untrusted content include:
- Everything in the HTTP request:
  - URL: path, query string parameters etc.
  - Headers: cookies, user agent etc.
  - Request body: form data with user input, uploaded files etc.
- Data from third-party web services and APIs
An important rule of web application development is: Always mistrust user input!
Code injection
Untrusted content can cause processing errors in the back-end and front-end, but why is it a security concern?
Untrusted content gets into the database and eventually into the HTML, CSS or JavaScript code. If it is not treated correctly along the way, the result is a possible code injection.
Code injection is a serious security threat, especially the injection of JavaScript code. The injected code typically runs with the same privileges as the developer’s code. Such code can hijack user sessions, forge HTTP requests, read and expose private data, change content, spread misinformation, steal money etc.
Language syntax and escaping
To understand the background of XSS, we need to understand the nesting of data. Every programming language and data format has this problem in its own syntax.
For example: When does a string literal start and end? Typically, there is a delimiter character that marks the beginning and end of a string. In Ruby and JavaScript, this is a single or double quotation mark. Example:
"A string uses specific delimiters."
The parser reads this code, recognizes the delimiters and treats characters in between as part of the string, not as code.
But what happens if the string contains the delimiters? This won’t work:
"A String that contains "delimiters"."
This code has a syntax error because the second quotation mark already terminates the string. The parser would try to process “delimiters” as code again.
The solution is:
"The \"delimiters\" need to be escaped with a backslash. The escape character \\ needs to be escaped as well."
The resulting string is “The "delimiters" need to be escaped with a backslash. The escape character \ needs to be escaped as well.”
These character sequences starting with a backslash are called escape sequences. They tell the parser to treat the following character verbatim and not as code. They neutralize the special meaning of the character.
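The escaping rules above can be tried out in a few lines of Ruby. This is only a minimal sketch; the variable names are made up for illustration.

```ruby
# The backslash escapes both the delimiter and itself inside the literal.
escaped = "The \"delimiters\" need a backslash: \\"

# The parser has already resolved the escape sequences, so the value
# contains plain quotation marks and a single backslash.
puts escaped

# String#inspect converts the value back into a literal and re-adds the escapes.
puts escaped.inspect

# A single-quoted literal uses a different delimiter, so double quotation
# marks need no escaping there. The backslash still does.
same = 'The "delimiters" need a backslash: \\'
puts escaped == same
```

The important point is that escaping belongs to the literal syntax, not to the data: both literals produce exactly the same string value.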
Language nesting
A typical Ruby on Rails software stack nests and chains languages – one language generates another or is translated into another language. For example, HAML is compiled to HTML and can contain Ruby, CSS and JavaScript. Sass is compiled to CSS; CoffeeScript to JavaScript. JavaScript itself can contain HTML and CSS. Most of these languages may contain common formats like URL, JSON and SVG.
Language nesting is a potential security problem because data changes its context and needs to be treated correctly to prevent errors and code injection.
A typical Ruby on Rails application uses the template languages ERB and HAML. Since they concatenate strings to generate HTML, they just make a vague guess about the target context. They treat HTML as one context, which it isn’t, as we will see later.
HAML understands the HTML syntax a bit better than ERB. It can distinguish between elements, attributes and text content. For safe embedding of JavaScript and CSS, it has filters like :javascript and :css.
A template language designed with security in mind should know the different contexts of the target language so it can escape appropriately. For example, an XML-/XSLT-based template language is parsed into a tree. The processor is able to understand the nesting of languages correctly, for example JavaScript embedded into HTML.
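To illustrate why the target context matters, here is a rough sketch of data being nested into JavaScript that is itself nested into HTML. The helper `js_literal` is hypothetical and written only for this example; it is not a Rails API, and it assumes the value ends up inside a script element.

```ruby
require "json"

# Untrusted value: harmless as plain text, dangerous inside a JavaScript string.
name = %q(";alert('XSS');//)

# Naive nesting: the quotation mark in the data ends the JavaScript string early,
# and the rest of the data is parsed as code.
naive = %(var name = "#{name}";)

# Hypothetical helper for this sketch: encode the value as a JavaScript string
# literal, then neutralize the characters that matter to the surrounding HTML
# parser so that "</script>" in the data cannot close the script element.
def js_literal(value)
  value.to_json.gsub(/[<>&]/) { |char| format('\u%04x', char.ord) }
end

safe = "var name = #{js_literal(name)};"

puts naive
puts safe
puts js_literal("</script>")
```

Two escaping steps are needed here because two parsers are involved: the JavaScript parser sees the string literal, but the HTML parser sees the script element first.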
General HTML escaping
In HTML element content and attribute values, some characters have a special meaning. They need to be escaped so the browser processes them as plain text, not markup. Replace these characters with character references, either named entity references or numeric character references.
Character | Escaped character |
---|---|
< | &lt; or &#60; |
> | &gt; or &#62; |
" | &quot; or &#34; |
' | &apos; or &#39; |
& | &amp; or &#38; |
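As a quick check of these replacements, the following sketch prints what Ruby’s standard library helper `ERB::Util.html_escape` emits for each special character. Which of the two equivalent references a library chooses is an implementation detail.

```ruby
require "erb"

# Print each special character next to the reference the helper emits for it.
["<", ">", '"', "'", "&"].each do |char|
  puts "#{char} becomes #{ERB::Util.html_escape(char)}"
end
```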
In a Rails application, ERB and HAML perform this HTML escaping by default.
Let’s assume there is malicious input with HTML and JavaScript code:
input = "<script>alert('XSS')</script>"
In the ERB template, the input is written to the document:
<p><%= input %></p>
Generated output:
<p>&lt;script&gt;alert(&#39;XSS&#39;)&lt;/script&gt;</p>
Thanks to ERB’s automatic HTML escaping, the script injection was prevented. But usually it’s more complicated.
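The automatic escaping comes from the Rails view layer. Plain ERB from Ruby’s standard library inserts values verbatim, which makes it a convenient way to see both outcomes side by side. The following is a sketch outside of Rails, with the escaping call written out explicitly:

```ruby
require "erb"

input = "<script>alert('XSS')</script>"

# Plain ERB (no Rails): the value is inserted verbatim, the script would run.
raw = ERB.new("<p><%= input %></p>").result(binding)

# Explicit escaping, roughly what a Rails view does automatically for <%= %>.
escaped = ERB.new("<p><%= ERB::Util.html_escape(input) %></p>").result(binding)

puts raw
puts escaped
```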
Types of XSS
This presentation won’t mention all XSS principles, but we need to distinguish between two types of XSS:
Reflected XSS
Code is injected using the HTTP request and is only present in the associated HTTP response. The attack vector is mostly the URL. Every user who opens a crafted link that contains the injected code is affected.
Reflected XSS is typically considered the less severe type, but don’t underestimate it. Social media and e-mail spam make it easy to spread prepared URLs.
Let’s have a look at a simple example of Reflected XSS. Assume there is a URL that contains malicious code (HTML with JavaScript) in the query string:
http://example.com/?id=<script>alert(1)</script>
Assume there is a PHP script on the server that outputs the value of the id parameter without context-specific escaping.
This creates a Reflected XSS hole because the server “reflects” the input in the output without filtering malicious code.
This is just a simple example – most of the time it’s more complex and the security vulnerability is not that obvious.
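The same mistake can be sketched in plain Ruby. The two handler methods below are hypothetical and stand in for whatever code builds the response; they only assume that the raw query string is available.

```ruby
require "uri"
require "erb"

# Hypothetical handler: reflects the "id" parameter into the page verbatim.
def vulnerable_page(query_string)
  params = URI.decode_www_form(query_string).to_h
  "<p>ID: #{params['id']}</p>"
end

# Same handler, but the value is HTML-escaped before it enters the markup.
def escaped_page(query_string)
  params = URI.decode_www_form(query_string).to_h
  "<p>ID: #{ERB::Util.html_escape(params['id'])}</p>"
end

# The query string as it would arrive from a crafted link (percent-encoded).
query = URI.encode_www_form(id: "<script>alert(1)</script>")

puts vulnerable_page(query)
puts escaped_page(query)
```

Nothing is stored on the server in this scenario: the payload travels in the request and comes straight back in the response.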
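Some browsers tried to mitigate Reflected XSS with built-in filters that refused to execute JavaScript code originating from the URL or from form data. These filters got suspicious when both input and output contained the same JavaScript code. They were heuristics and have since been removed from Chrome and Edge, so they are no substitute for correct escaping on the server.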
Persistent XSS (aka Stored XSS)
Persistent XSS means that malicious code is stored on the server, for example in the database, and is sent to other users with every response to a specific URL. Therefore, Persistent XSS potentially affects all users visiting a site. In contrast to Reflected XSS, the malicious code doesn’t need to be part of each request once it has been placed on the server.
There are multiple attack vectors for Persistent XSS. Data from all parts of the HTTP request (the URL, headers like “Cookie”, form data…) can be harmful when it is stored on the server and output again without treatment.
Content that is loaded from third parties, especially HTML and JavaScript, may also inject code persistently. This includes JavaScript libraries loaded from Content Delivery Networks (CDNs), as well as advertisement and web analytics scripts.
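A stored payload can be sketched in the same way. Here a plain array stands in for the database, and both render methods are hypothetical; the point is only that the data is stored once and then sent to every later visitor.

```ruby
require "erb"

# Renders stored comments without any treatment.
def render_unsafe(comments)
  "<ul>#{comments.map { |text| "<li>#{text}</li>" }.join}</ul>"
end

# Renders stored comments with HTML escaping at output time.
def render_escaped(comments)
  items = comments.map { |text| "<li>#{ERB::Util.html_escape(text)}</li>" }
  "<ul>#{items.join}</ul>"
end

# Stand-in for a database table: content is stored exactly as submitted.
stored = []
stored << "Nice article!"
stored << "<img src=x onerror=alert(1)>"

puts render_unsafe(stored)
puts render_escaped(stored)
```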
Rails does not save us from XSS holes
Rails 4, ERB and HAML have good defaults that prevent simple XSS attacks. They create SafeBuffers and HTML-escape input by default. But the devil is in the details. Most likely, every non-trivial Ruby on Rails application is affected by XSS; we just don’t know it yet because such holes aren’t easy to find.
Places where XSS holes hide in a Rails application:
- Rails view helpers that create HTML code dynamically, but do not correctly escape the input data.
- User-generated HTML isn’t filtered correctly, for example from a web-based rich text editor.
- HTML is fetched from a third-party API and embedded into the page without filtering. To attack a well-secured site with XSS, an attacker just needs to compromise the weakest third-party script provider.
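The first item in this list can be illustrated with a plain-Ruby sketch of a helper that assembles markup by string interpolation. Both methods are hypothetical and do not use Rails; in a real application, the danger arises when such a string is additionally marked as safe so that the template no longer escapes it.

```ruby
require "erb"

# Hypothetical helper: builds a link by interpolating the input verbatim.
def unsafe_user_link(name, url)
  %(<a href="#{url}" title="#{name}">#{name}</a>)
end

# Same helper with every interpolated value HTML-escaped.
# Note: escaping does not check the URL scheme (for example "javascript:").
def escaped_user_link(name, url)
  h = ERB::Util.method(:html_escape)
  %(<a href="#{h.call(url)}" title="#{h.call(name)}">#{h.call(name)}</a>)
end

# A user name that closes the attribute value and adds an event handler.
name = %q(" onmouseover="alert(1))

puts unsafe_user_link(name, "/users/1")
puts escaped_user_link(name, "/users/1")
```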