Google Gruyere

Let’s practice some Web Hacking. For this purpose, there is a good resource developed by Google. It’s a Website called Google Gruyere; it is a Hacking Lab and, as its name suggests, it is riddled with vulnerabilities. Link : https://google-gruyere.appspot.com/

This website mimics the principles of a very basic social network, where you can create a user profile (name, photo, pinned message, website…), manage it, and post some short messages (called snippets in this case), so it really makes sense as study material

The Lab shows how web application vulnerabilities can be exploited and how to defend against these attacks. Among other Challenges, we will practice cross-site scripting (XSS) and cross-site request forgery (XSRF), and also get an opportunity to assess the impacts of such vulnerabilities (denial-of-service, information disclosure, remote code execution…)

Some of these Challenges can be solved by using black box techniques, while others require looking at the Gruyere source code (that’s why Google provides both client side and server side code along with the Lab). The code is written in Python. Reading through the code will help build a good understanding of how the vulnerabilities work. The code relies on Templates (Gruyere Template Language, or GTL) and, in this respect, looks similar to Django (https://www.djangoproject.com/start/overview/)

Google provides all the solutions on the Gruyere site, so I’m not going to provide new solutions but rather walk through the proposed solutions


Introduction to Gruyere

The Lab looks like this when you launch it for the first time (it will create an incremental session number, specific to you)

As a warm-up, we are requested to perform a few basic tasks, to gain a first understanding of the user interface :

  • View another user’s snippets by following the “All snippets” link on the main page. Also check out what they have their Homepage set to
  • Sign up for an account for yourself to use when hacking
  • Fill in your account’s profile, including a private snippet and an icon that will be displayed by your name
  • Create a snippet (via “New Snippet”) containing your favorite joke
  • Upload a file (via “Upload”) to your account

This is what my login page now looks like :

About snippets : some of you may not know where this comes from. It is the usual word used by Google to highlight a summary text in the Google search engine result. In our context, a snippet refers to a small bit of text added as a tag, after a user name

Before going through the Challenges, let’s have a first look into the code (I’m using the code editor “Sublime Text”)

Here is a short explanation about the Gruyere modules (we will come back to it with deeper analysis during the Challenges) :

data.py stores the default data in the database. There is an administrator account and three default users

gruyere.py is the main Gruyere web server

The code enables the setup of a local server with the necessary functionalities (creation of a working directory, installation of a database, cookie management, URL/HTML responses, management of user profile, data/file upload,…)

Here are some important comments included in the code; they help to understand the server logic and its limitations

gtl.py is the Gruyere Template Language

Gruyere Template Language (GTL) is a new template language and, like its siblings such as Django, it helps create web pages more efficiently. Documentation for GTL can be found directly in gruyere/gtl.py

Most of the Gruyere resources are written using GTL

sanitize.py is the Gruyere module used for sanitizing HTML, to protect the application from security holes

HTML sanitization is the process of examining an HTML document and producing a new one that preserves only whatever tags are designated “safe” and desired. HTML sanitization can be used to protect against attacks such as cross-site scripting (XSS) by sanitizing any HTML code submitted by a user. For example, tags such as <script> are usually removed during the sanitizing process. The process is usually based upon a white list of allowed tags, and a black list of disallowed tags

In our case, here are the allowed/disallowed tags

resources directory holds all CSS code, images, template files (it will provide important functionalities such as account creation, login process, user profile, snippets, file upload), and a Javascript library (for snippets user interaction and refresh)


Data Sanitization and Escaping

We should always be very careful with user inputs on our websites. Some users will try to “inject code” into our website, using different tricks (as we will see in the following Challenges)

The root cause of code injection vulnerabilities is the mixing of code and data which is then handed to a browser. Injection is possible when the data is treated as code

Data sanitization and Escaping is a mitigation for code injection vulnerabilities

Sanitization involves removing characters entirely in order to make the value “safe”. Sanitization is difficult to do correctly, that’s why most sanitization implementations have seen a number of bypasses

The term “Escaping” originates from situations where text is being interpreted in some mode and we want to “escape” from that mode into a different mode

For example, you want to tell your terminal to switch from interpreting a sequence of code to text

One common situation is when a developer needs to tell a browser to not interpret a value as code but as text
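As a minimal sketch of this idea in Python (the standard library function html.escape is used here for illustration only; Gruyere relies on its own template modifiers), the same user input can be turned into inert text before it reaches the browser :

```python
import html

user_input = "<script>alert(document.cookie)</script>"

# Escaping replaces the HTML metacharacters with entities, so the browser
# displays the value as text instead of interpreting it as markup.
escaped = html.escape(user_input)

print(escaped)
```

The browser renders the escaped value exactly as the user typed it, but never runs it as a script.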

Please keep in mind these concepts as they will be important for the Challenges

If you want to go deeper on these topics before starting the Challenges, you could check this OWASP video about “killing injection vulnerabilities”


Cross-Site Scripting (XSS)

Cross-site scripting (XSS) is a vulnerability that permits an attacker to inject code (typically HTML or JavaScript) into contents of a website not under the attacker’s control. When a victim views such a page, the injected code executes in the victim’s browser. Thus, the attacker can steal victim’s private information associated with the website in question

File Upload XSS

This attack is well documented by the OWASP Foundation : https://bit.ly/3xVUjqE

We can create the following file, including a warning message “Forensicxs hacked you”, and a script – Javascript – that will pop-up the content of the session cookie

Here below a bit more information about cookies and the potential use of document.cookie

https://javascript.info/cookie

Let’s upload our HTML file into Gruyere. Once the upload is completed, we are provided with an https link to access our file directly from the browser

Let’s follow this link in our browser. We get the alert(document.cookie) pop-up window displayed. It provides the content of our session cookie

alert(document.cookie) pop-up window

When we click OK, we land on our HTML message “Forensicxs hacked you”. In developer mode, we can confirm the cookie value

The cookie content is consistent with the server side code

function CreateCookie in gruyere.py

Our script is very basic, but we could write a more sophisticated one that would be able to steal the user information contained in the cookie. The next step of the hack is to make the upload link (https://bit.ly/3jYcVl7) available to our victim. Once the victim clicks on it, the script runs in the victim’s session and steals the data it was written to collect (our basic script only displays the cookie, it does not send it to us). Since the file we uploaded is on a trusted site, a user will more likely trust the link and click on it. As a summary, the attack looks as follows :

https://bit.ly/3gaXmoY

There are many countermeasures detailed by the OWASP : https://bit.ly/3xVUjqE. Here below some important ones :

  • host the content on a separate domain so the script won’t have access to any content from your domain. That is, instead of hosting user content on example.com/username we would host it at username.usercontent.example.com or username.example-usercontent.com. (Including something like “usercontent” in the domain name avoids attackers registering usernames that look innocent like wwww and using them for phishing attacks.)
  • the application should perform filtering and content checking on any files which are uploaded to the server. Files should be thoroughly scanned and validated before being made available on the server. If in doubt, the file should be discarded

We can see in the Gruyere code that none of these protections are included in the server side
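To make the second countermeasure more concrete, here is a rough sketch of an upload check. The allowed extensions, the suspicious markers and the function name is_upload_acceptable are all assumptions chosen for illustration; they are not part of Gruyere, and this kind of content check is no substitute for hosting user content on a separate domain :

```python
import os

# Hypothetical policy: only these extensions are accepted for uploads.
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".txt"}

# Byte sequences suggesting the file could be rendered as active content.
SUSPICIOUS_MARKERS = (b"<script", b"<html", b"<svg", b"javascript:")


def is_upload_acceptable(filename: str, content: bytes) -> bool:
    """Return True only if the extension is allowed and no marker is found."""
    extension = os.path.splitext(filename)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        return False
    lowered = content.lower()
    return not any(marker in lowered for marker in SUSPICIOUS_MARKERS)


print(is_upload_acceptable("hack.html", b"<script>alert(document.cookie)</script>"))
```

With such a check, the HTML file used in this Challenge would be discarded, in line with the “if in doubt, discard” rule.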

Reflected XSS

This attack is well documented by the OWASP Foundation : https://bit.ly/3CUevwZ

Reflected cross-site scripting arises when an application receives data in an HTTP request and includes that data within the immediate response in an unsafe way. For example :

https://bit.ly/3xZzivr

The questions guide us to type “invalid” in the address bar. We get an error message, as this address leads to nothing on the Gruyere Website

And we can see in the Developer tools that the code has been inserted

In a reflected XSS, an attacker forces the web-application to return an error search result, or any other response that includes some or all of the input provided by the user as part of the request, without that data being made safe to render in the browser, and without permanently storing the user provided data

The questions provide us tips about potential code to type in. Let’s type again a script such as this one : <script>alert(document.cookie)</script>

Our session cookie is displayed in the message box, and we can also see the script inserted in the Elements inspector

This demonstrates the Reflected XSS. We could pass this link to a victim and steal their session information (but we would need a more complex script to retrieve the victim’s data)

Now, let’s have a look at the flaws in the code. One issue is that Gruyere sends us an error message, but the script is included in the output rendered code (as seen before in the Elements inspector)

As per Google’s statement in the Challenge, one fix is to escape the user input that is displayed in error messages. Error messages are displayed using error.gtl, but are not escaped in the template. The part of the template that renders the message is {{_message}} and it’s missing the modifier that tells it to escape user input. Adding the :text modifier escapes the user input. This is called manual escaping

<div class="message">{{_message:text}}</div>

I try to fix this issue by downloading the Gruyere code, modifying it and running it locally

Modification of the code
Running Gruyere locally (the session code is new as I switched to Linux during the Challenge)

We can see that the script is escaped to text and the script is not executed

The simplest and best means to protect an application and their users from XSS bugs is to use a web template system or web application development framework that auto-escapes output and is context-aware

“Auto-escaping” refers to the ability of a template system or web development framework to automatically escape user input in order to prevent any scripts embedded in the input from executing. If you wanted to prevent XSS without auto-escaping, you would have to manually escape input; this means writing your own custom code (or calling an escape function) everywhere your application includes user-controlled data. In most cases, manually escaping input is not recommended

“Context-aware” refers to the ability to apply different forms of escaping based on the appropriate context. Because CSS, HTML, URLs, and JavaScript all use different syntax, different forms of escaping are required for each context
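The need for different escaping per context can be illustrated with a short sketch based only on the Python standard library. The sample value is made up, and a real context-aware template engine does considerably more than these three calls :

```python
import html
import json
from urllib.parse import quote

value = 'Tom & "Jerry" <b>'

# HTML body or attribute context: replace markup metacharacters with entities.
as_html = html.escape(value)

# URL query-parameter context: percent-encode every reserved character.
as_url = quote(value, safe="")

# JavaScript string-literal context: emit a quoted, backslash-escaped literal.
as_js = json.dumps(value)

print(as_html)
print(as_url)
print(as_js)
```

The three results differ, which is exactly why one escaping function applied everywhere is not enough. Note that json.dumps leaves the < character untouched, so a value placed inside an inline script block would need additional handling.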

Here in the link an article about how the template language Django handles autoescaping and various other XSS protections : https://bit.ly/3sGT6CU. There is no such autoescaping in the GTL language

Stored XSS

This attack is well documented by the OWASP Foundation : https://bit.ly/3CUevwZ

As suggested by the questions, we can input the following script in the “New Snippet” form

After clicking Submit, we can hover our mouse on the “read this” link, and again, we see a pop-up window with our session cookie

We can see that our script has been embedded in the code

Any user who hovers their mouse over my snippet would be at risk (but actually stealing any user data would require a more complex script). So, what is wrong with the code ? We do have a Sanitizer, but obviously it’s too weak to catch this threat

The call to the Sanitizer is done in this piece of code, inside the Template language GTL

gtl.py

Let’s have a deeper look at sanitize.py

sanitize.py

The attribute “onmouseover” is not in the disallowed attributes. That’s an obvious flaw. Let’s add this attribute in the code and let’s test again

Now, the script is not executed; it’s blocked by this additional disallowed attribute. But, if we write “ONMOUSEOVER” in capital letters, the script is executed again

So we can see that the Sanitizer is sensitive to upper/lower case, and probably to other variations as well. Developing our own Sanitizer is surely not a robust method
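This class of bug can be reproduced in a few lines. The blocklist and the two function names below are hypothetical and only illustrate the principle; they are not Gruyere’s sanitize.py :

```python
# Hypothetical blocklist, written in lowercase like the attribute added above.
DISALLOWED_ATTRIBUTES = {"onmouseover", "onclick", "onload"}


def is_blocked_naive(attribute_name: str) -> bool:
    """Case-sensitive comparison: misses any variation in letter case."""
    return attribute_name in DISALLOWED_ATTRIBUTES


def is_blocked_normalized(attribute_name: str) -> bool:
    """Normalize case and surrounding whitespace before comparing."""
    return attribute_name.strip().lower() in DISALLOWED_ATTRIBUTES


print(is_blocked_naive("ONMOUSEOVER"))       # the naive check lets it through
print(is_blocked_normalized("ONMOUSEOVER"))  # the normalized check blocks it
```

Since browsers treat HTML attribute names as case-insensitive, any filter that compares them case-sensitively can be bypassed this way. Normalizing helps, but a blocklist still misses every attribute nobody thought of listing.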

As per Google, the right approach to HTML sanitization is to :

  • Parse the input into an intermediate DOM structure, then rebuild the body as well-formed output
  • Use strict whitelists for allowed tags and attributes
  • Apply strict sanitization of URL and CSS attributes if they are permitted

This is done using a proven HTML sanitizer, such as this one : https://bit.ly/3sx15SW
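To show what “parse, then rebuild from a strict whitelist” means, here is a simplified sketch built on Python’s html.parser module. The allowed tags, the class name AllowlistSanitizer and the function sanitize are assumptions for illustration. This sketch does not rebalance unclosed tags and is not a replacement for a proven sanitizer :

```python
import html
from html.parser import HTMLParser

# Hypothetical allowlist: tag name -> attributes permitted on that tag.
ALLOWED = {"b": set(), "i": set(), "p": set(), "a": {"href"}}


class AllowlistSanitizer(HTMLParser):
    """Rebuild the input, keeping only allowlisted tags and attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            return  # drop the tag itself; its text content is still escaped
        kept = []
        for name, value in attrs:
            value = value or ""
            if name not in ALLOWED[tag]:
                continue
            # Only plain web links are accepted in URL attributes.
            if name == "href" and not value.lower().startswith(("http://", "https://")):
                continue
            kept.append(f' {name}="{html.escape(value)}"')
        self.parts.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(html.escape(data))


def sanitize(markup: str) -> str:
    parser = AllowlistSanitizer()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)


print(sanitize('<a ONMOUSEOVER="alert(1)" href="https://example.com">read this</a>'))
```

Because the parser lowercases tag and attribute names before the allowlist is consulted, the “ONMOUSEOVER” trick no longer works, and anything not explicitly allowed is dropped by default.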

Stored XSS via HTML Attribute

Let’s start with a useful reminder about HTML attributes : https://bit.ly/2W5b9a5

Here is an example based upon the style attribute, which lets us add style, such as a color, to an element

Now, let’s start the Challenge. My profile color is green

We are totally guided here and just need to type the following script in the profile box

The script is launched each time the mouse hovers over the profile name

Now, let’s check in the code where this flaw is occurring. There are two noticeable code sections where colors are managed. The first one is in “home.gtl”. We can see that the color parameter is escaped as follows : {{color:text}}

home.gtl

The second one is in “editprofile.gtl”. The color parameter is also escaped as follows :{{_profile.color:text}}

editprofile.gtl

So, how come the script is executed ? The bug is in this code section of gtl.py

gtl.py

What is CGI ? Common Gateway Interface (CGI) is an interface specification that enables web servers to execute an external program, typically to process user requests

A typical use case occurs when a Web user submits a Web form on a web page that uses CGI. The form’s data is sent to the Web server within an HTTP request with a URL denoting a CGI script. The Web server then launches the CGI script in a new computer process, passing the form data to it. The output of the CGI script, usually in the form of HTML, is returned by the script to the Web server, and the server relays it back to the browser as its response to the browser’s request

CGI is a rather old technology. It requires some coding effort for complex websites. Nowadays, Frameworks such as Django (Python) manage the link between the Front-End and Back-End more efficiently

Now, back to the code. When handling user input, cgi.escape is designed to escape an HTML attribute value that is delimited by double quotes “, as per this example

code with double quote “

But in our script, we have just put single quotes ‘

code with single quote ‘

That’s why the script is slipping through the cgi.escape function and is executed

To be noted : cgi.escape will never escape single quotes

https://www.ietf.org/rfc/rfc3875

Let’s improve the escape function, that will escape single and double quotes too. As per Google recommendation, let’s add this function to gtl.py : _EscapeTextToHtml()

gtl.py

The single quote escape is this line : '\'': '&#39;'. You can find some further explanations in this article : https://bit.ly/3j2tPjk
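For readers who cannot see the screenshot, here is a rough modern-Python sketch of an escape function of this kind. The names META_CHARS and escape_text_to_html and the sample payload are illustrative assumptions, not the exact code from Gruyere. In recent Python versions cgi.escape no longer exists (it was removed in Python 3.8), and html.escape escapes both kinds of quotes by default :

```python
# Characters that are dangerous inside HTML text or attribute values.
META_CHARS = {
    '"': "&quot;",
    "'": "&#39;",
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
}


def escape_text_to_html(value: str) -> str:
    """Replace every HTML metacharacter, including both kinds of quotes."""
    return "".join(META_CHARS.get(char, char) for char in value)


# A color value that tries to break out of a single-quoted attribute.
payload = "red' onmouseover='alert(1)"
attribute = "<span style='color:" + escape_text_to_html(payload) + "'>"

print(attribute)
```

With the single quotes turned into entities, the payload stays inside the style attribute value and can no longer open a new onmouseover attribute.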

Let’s replace cgi.escape by _EscapeTextToHtml()

gtl.py

This is, in my case, efficiently blocking the script (my browser : Firefox in Kali Linux)

The rest of the questions apply to those of you still running a deprecated version of Internet Explorer, which had some flaws in dealing with CSS dynamic properties. Please refer to the explanations provided by Google, as I’m not going to perform this part of the Challenge

Stored XSS via AJAX

AJAX stands for Asynchronous JavaScript And XML. AJAX allows web pages to be updated asynchronously by exchanging data with a web server behind the scenes. This means that it is possible to update parts of a web page, without reloading the whole page

AJAX is not a programming language. It just uses a combination of:

  • A browser built-in XMLHttpRequest object (to request data from a web server)
  • JavaScript and HTML DOM (to display or use the data)

AJAX is quite a misleading name. AJAX applications might use XML to transport data, but it is equally common to transport data as plain text or JSON text

Here below an introduction video with some code examples

Gruyere uses AJAX principles to implement refresh on the home and snippets page

In a real application, refresh would probably happen automatically, but in Gruyere it is made manual. So, the user can be in complete control

When clicking the refresh link, Gruyere fetches feed.gtl which contains refresh data for the current page and then the client-side script uses the browser DOM API (Document Object Model) to insert the new snippets into the page. Here is an introduction video about the DOM, in case of need

Since AJAX runs code on the client side, this script is visible to attackers who do not have access to the source code

Let’s start the Challenge. We are invited to start a curl request on the feed.gtl page

What is curl ? It’s a command-line tool for transferring data specified with URL syntax. Find out how to use curl by reading the curl.1 man page or the MANUAL document. Find out how to install curl by reading the INSTALL document.

libcurl is the library curl is using to do its job. It is readily available to be used by your software. Read the libcurl.3 man page to learn how!

The most important part in the code is this one

feed.gtl

Here is the output of the curl request on the feed.gtl. The result is consistent with the above code

curl on feed.gtl

We see that the corresponding code has been inserted on the client side

Now let’s input the code injection suggested by Google

The JSON output of the curl command reads like this. The code is injected

Now let’s click on the refresh button. There is a pop-up window with the “1” prompt. The snippet reads “all your base” (that means that our script is “invisible“), the rest of the script enables the pop-up window

This is evidence of a stored XSS via AJAX. The flaws are on both the server side and the client side

Server side :

The code below does not include any sanitizer (snippet:html or snippets.0:html)

feed.gtl

As Google says, the text is going to be inserted into the innerHTML of a DOM node so the HTML does have to be sanitized. However, that sanitized text is then going to be inserted into JavaScript and therefore single and double quotes have to be escaped too
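The two-step encoding described here can be sketched as follows. html.escape stands in for a real HTML sanitizer, and the function name encode_snippet_for_feed and the sample payload are assumptions for illustration, not Gruyere code :

```python
import html
import json


def encode_snippet_for_feed(snippet: str) -> str:
    """Escape for HTML first, then encode as a JavaScript/JSON string literal."""
    # Step 1: the value will end up in innerHTML, so neutralize markup.
    # (html.escape stands in here for a real HTML sanitizer.)
    html_safe = html.escape(snippet)
    # Step 2: the value is embedded in script text, so quotes and backslashes
    # must be escaped too; json.dumps returns a quoted, escaped literal.
    return json.dumps(html_safe)


payload = 'all your base" + alert(1) + "'
encoded = encode_snippet_for_feed(payload)
print(encoded)
```

After both steps, the only double quotes left in the output are the two that delimit the string literal, so the payload cannot close the string and append its own code.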

lib.js

Client side :

A common use of JSON is to exchange data to/from a web server. When receiving data from a web server, the data is always a string. To become a JavaScript object, we have to parse the data

Gruyere converts the JSON by using JavaScript‘s eval() function : https://bit.ly/3gn4H54

The eval() function evaluates or executes an argument. If the argument is an expression, eval() evaluates the expression. If the argument is one or more Javascript statements, eval() executes the statements

In modern programming eval is used very sparingly. It’s often said that “eval is evil”

The reason is simple : some time ago, JavaScript was a much weaker language, many things could only be done with eval. But that time is over

https://do.co/3szEGV3

Right now, there’s almost no reason to use eval. If someone is using it, there’s a good chance they can replace it with a modern language construct or a JavaScript Module

It is recommended by Google to use the JSON.parse() function : https://bit.ly/2WccjjI
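Gruyere’s client code is JavaScript, but the same difference between evaluating and parsing exists in Python, which makes for a convenient analogue. In this sketch the two response strings are made up, and Python’s eval() and json.loads() play the roles of JavaScript’s eval() and JSON.parse() :

```python
import json

# A well-formed response, and a response carrying an expression instead of data.
honest_response = '{"snippet": "all your base"}'
hostile_response = '(lambda: "attacker code ran")()'

# eval() runs whatever expression it is given.
evaluated = eval(hostile_response)

# A strict parser only accepts data and raises an error on anything else.
parsed = json.loads(honest_response)
try:
    json.loads(hostile_response)
    rejected = False
except json.JSONDecodeError:
    rejected = True

print(evaluated, parsed, rejected)
```

The evaluator happily executes the hostile string, while the parser refuses it. That is the whole argument for parsing JSON instead of evaluating it.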

Reflected XSS via AJAX

This “reflected XSS via AJAX” is very close to what we did in the previous paragraph “Stored XSS via AJAX”

Google provides us the scripts to put to the test (the two lines will work in the same way)

The alert box is displayed as usual once we click the refresh button

Contrary to the Stored XSS, there is no code injection on the Server, it’s done directly in the Client

The flaw is in this section of the code

feed.gtl

An HTML Sanitizer should be included to prevent such a script from being executed

Python HTML Sanitizer

For a good security, it’s best to rely on a template language and apply a security technology designed for a template system. A self made Sanitizer will most likely not be a good solution

Therefore, let’s mention some well known HTML Sanitizers for Python and the template language Django, as Gruyere is based upon a similar technology with GTL

Bleach is an excellent HTML Sanitizer, doing all the basic work of a Sanitizer. For correct operation inside a template language such as Django, you will need an extra layer provided by a Django HTML Sanitizer


Client-State Manipulation

We should not trust any data coming from the user : it is the browser on the user’s machine that sends this data back to our web server, and the user can modify it at will

Elevation of Privilege

Privilege escalation or elevation, can be defined as an attack that involves gaining illicit access of elevated rights, or privileges, beyond what is intended or entitled for a user

This attack can involve an external threat actor or an insider. Privilege escalation is a key stage of the cyberattack chain and typically involves the exploitation of a privilege escalation vulnerability, such as a system bug, misconfiguration, or inadequate access controls

In this Challenge, we are going to elevate our account to administrator, using a specially crafted user input, and taking advantage of some flaws in Gruyere code

We notice some interesting code in the editprofile.gtl, showing the “saveprofile” process

editprofile.gtl

Therefore we can input the following code to our home URL :

/saveprofile?action=update&is_admin=True

We then need to log out and log in to update our session cookie. Then, a link “Manage this server” has appeared in our profile. We can input our name :

Our profile management page now has the admin and author buttons. That means our privileges have been elevated to administrator rights

The flaw is that there are no validations of the above user query on the server side. A user ID without admin rights can place the request to become admin, which should not be possible
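A server-side check for this kind of request could look like the sketch below. The field names and the function apply_profile_update are assumptions for illustration and do not reflect Gruyere’s actual handler :

```python
# Fields any signed-in user may change on their own profile (hypothetical list).
USER_EDITABLE_FIELDS = {"name", "color", "icon", "web_site", "private_snippet"}
# Fields that only an administrator may change.
ADMIN_ONLY_FIELDS = {"is_admin", "is_author"}


def apply_profile_update(profile: dict, params: dict, requester_is_admin: bool) -> dict:
    """Return an updated copy of the profile, rejecting unauthorized fields."""
    updated = dict(profile)
    for field, value in params.items():
        if field in USER_EDITABLE_FIELDS:
            updated[field] = value
        elif field in ADMIN_ONLY_FIELDS and requester_is_admin:
            updated[field] = (value == "True")
        else:
            raise PermissionError(f"field not allowed: {field}")
    return updated


profile = {"name": "Forensicxs", "is_admin": False}
print(apply_profile_update(profile, {"color": "green"}, requester_is_admin=False))
```

The important point is that the decision is based on who is making the request, as known by the server, and never on what the request itself claims.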

Here attached a complete walkthrough of this Challenge

For further reading, here are some typical access control vulnerabilities and potential mitigations :

https://portswigger.net/web-security/access-control

Cookie Manipulation

A stateless protocol is a communication protocol in which the receiver must not retain the session state from previous requests. The sender transfers relevant session state to the receiver in such a way that every request can be understood in isolation, that is without reference to session state from previous requests retained by the receiver

Examples of stateless protocols include the Internet Protocol (IP), which is the foundation for the Internet, and the Hypertext Transfer Protocol (HTTP), which is the foundation of the World Wide Web

A web server cannot automatically know that two requests are from the same user. For this reason, cookies were invented

When a web site includes a cookie in an HTTP response, the browser automatically sends the cookie back to the server on the next request. Web sites can use the cookie to save session state

Cookies are usually numeric hashes or plaintext variables that your browser stores and sends to the server, allowing you to stay signed in without logging in again, because you’ve already been authenticated by your cookie

If cookies authenticate an individual, then if someone else steals that cookie, they can impersonate the person it’s tied to — accessing their account, payment information, and other sensitive details without having to know their username or password

Gruyere uses cookies to remember the identity of the logged in user, in this format [hash|User name|admin|author]

Here is my session cookie visible in Burp Suite :

Set-Cookie: GRUYERE=112960044|Forensicxs|admin|author; path=/486176820694247485940447649923087546312

Now, let’s create a new account with the user name foo|admin|author and let’s see the result in Burp :

Set-Cookie: GRUYERE=19209336|foo|admin|author||author; path=/486176820694247485940447649923087546312

What is a cookie path, and how to define it ? Read below :

https://bit.ly/38kmcyi

So, in our unique session ID, we have been able, under the same path, to trick Gruyere into issuing a cookie that looks like the cookie of another user. We have also, as a side effect, been able to perform a privilege escalation as we gained admin rights. By inputting the string (foo|admin|author) into the username field we have successfully created an account which will return a cookie for someone with the username ‘foo’ and with admin rights

The code used to parse cookies on the server-side is tolerant to abnormal cookies — a cookie string with varying characters and lengths will still be read by the server. This means that an attacker doesn’t need to know how cookies are parsed on the server-side to pass a malicious cookie

Here are the security recommendations from Google :

  • The server should escape the username when it constructs the cookie
  • The server should reject a cookie if it doesn’t match the exact pattern it is expecting
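Both recommendations can be approximated with a strict pattern. In this sketch, the allowed character set, the length limit and the function names are assumptions, and usernames containing the separator are rejected outright rather than escaped, which is a stricter variant of the first recommendation. The hash at the start of the cookie would still need to be verified separately :

```python
import re

# Expected layout: hash|username|admin-flag|author-flag, and nothing more.
COOKIE_PATTERN = re.compile(r"(\d+)\|([A-Za-z0-9_]+)\|(admin)?\|(author)?")


def is_valid_username(username: str) -> bool:
    """Refuse names containing the field separator or other unusual characters."""
    return re.fullmatch(r"[A-Za-z0-9_]{1,16}", username) is not None


def parse_cookie_strict(value: str):
    """Return (username, is_admin, is_author), or None if the shape is wrong."""
    match = COOKIE_PATTERN.fullmatch(value)
    if match is None:
        return None
    _hash, username, admin, author = match.groups()
    return username, admin == "admin", author == "author"


print(is_valid_username("foo|admin|author"))
print(parse_cookie_strict("19209336|foo|admin|author||author"))
```

The crafted username is refused at sign-up, and the abnormal cookie from the example above no longer parses at all.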

We can see in the code below that only basic checks are included at the creation and afterwards

gruyere.py
gruyere.py

Here further reading about Cookie Security : https://bit.ly/3JlQMJu


Nowadays, Web Application Firewalls (WAF) are putting protections against cookie related attacks, such as this one : https://bit.ly/3jpaZD5

Now, let’s have a closer look at the cookie hash function :

h_data = str(hash(cookie_secret + c_data) & 0x7FFFFFF)

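  • cookie_secret : a static string (which is just '' by default, meaning the cookie secret is an empty string), intended to act as a secret salt
  • c_data : the cookie data to protect (the username, followed by the admin and author flags)
  • & 0x7FFFFFF : bitwise AND with hex 0x7FFFFFF, which keeps only the low 27 bits so that the result is always positive
  • str(hash()) : Python’s built-in hash() function, with its integer result converted to a string
  • h_data : the resulting hash value, placed at the start of the cookie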
Reminder : AND operator &

Python’s hash() is not fit for the purpose – or let’s say “insecure” in this context – because it’s possible to find cryptographic collisions. It is not a bug in Python, it’s just that it is not what it’s designed for in this use case

This hash() function is used for the hash tables behind Python’s dictionaries, where you can’t “afford” a fully secure hash function, because it would slow down the calculations and the use of these dictionaries too much

Here a video explaining in detail these hash tables and collisions

Python does provide secure hash functions in the hashlib module : https://bit.ly/3gK3hlb. They should be used each time a cryptographically secure application has to be implemented
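As an illustration of what a stronger cookie signature could look like, the sketch below uses HMAC-SHA256 from the standard hmac module, which is built on top of hashlib. HMAC is my own choice here, not something taken from the Gruyere solutions, and the secret value and function names are placeholders :

```python
import hashlib
import hmac

# Hypothetical server-side secret; in practice a long random value kept private.
COOKIE_SECRET = b"replace-with-a-long-random-secret"


def sign_cookie_data(cookie_data: str) -> str:
    """Return a hex HMAC-SHA256 tag for the cookie data."""
    return hmac.new(COOKIE_SECRET, cookie_data.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_cookie(cookie_value: str) -> bool:
    """Check a cookie of the form '<tag>|<data>' using a constant-time comparison."""
    tag, separator, cookie_data = cookie_value.partition("|")
    if not separator:
        return False
    expected = sign_cookie_data(cookie_data)
    return hmac.compare_digest(tag.encode("utf-8"), expected.encode("utf-8"))


data = "Forensicxs||author"
cookie = sign_cookie_data(data) + "|" + data
print(verify_cookie(cookie))
```

Without knowing the secret, an attacker who changes the data part (for example to add the admin flag) cannot compute a matching tag, so the tampered cookie is rejected.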

Because of its lack of cookie protection, Gruyere is also prone to replay attacks : https://bit.ly/2WzSdjQ


Cross-Site Request Forgery (XSRF/CSRF)

Cross-site request forgery (also known as XSRF/CSRF) is a web security vulnerability that allows an attacker to induce users to perform actions that they do not intend to perform. It allows an attacker to partly circumvent the same origin policy, which is designed to prevent different websites from interfering with each other

https://bit.ly/3mJmyHa

Let’s look at the URL used to delete a snippet. For this, let’s actually delete our snippet and check the result in Burp Suite

Burp Suite

We find the GET request : /deletesnippet?index=0

So now we can easily simulate a CSRF attack. We can put the complete URL https://google-gruyere.appspot.com/486176820694247485940447649923087546312/deletesnippet?index=0 in our Gruyere icon, using the Edit Profile feature

Gruyere : Profile

You can check for yourself that each time you add a snippet to your page and refresh the home page, the snippet is automatically deleted

To trigger the attack, we could imagine luring a victim to browse a page where this URL is embedded

Now, let’s look into the code. In the Edit Profile form, the code is accepting whatever text we are typing without any check

editprofile.gtl

The Edit Profile code uses the GET Method in the user input forms

editprofile.gtl

When we include deletesnippet?index=0 in the icon form, and after refresh, this triggers an action on the server with the function def _DoDeletesnippet

gruyere.py

The deletion of our snippet is then handled via this code

snippets.gtl

We find that Gruyere has a systematic flaw, which is the use of GET requests instead of POST requests for sending and updating sensitive data

GET is used for viewing something, without changing it, while POST is used for changing something. For example, a search page should use GET to get data, while a form that changes your password should use POST. Essentially GET is used to retrieve remote data, and POST is used to insert/update remote data

A GET request retrieves a representation of the specified resource and includes all required data in the URL. For example :

https://www.example.com/login.php?user=myuser&pass=mypass

A POST request is for writing : it submits data to be processed (for example from an HTML form) to the identified resource. This may result in the creation of a new resource, the update of existing resources, or both. Sending the same request several times may have side effects, because this will likely result in multiple writes. Browsers typically give you warnings about this. POST is not fully secure : the data is included in the body of the request instead of the URL, but it is still possible to view and edit it

Here is a quick summary :

https://bit.ly/3zMhN3w

A first action would be to change the GET to a POST request, as the GET method is not appropriate in this context. But, this will definitely not be sufficient
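A first step in that direction is to refuse state-changing actions that arrive through GET. The routing table and function below are hypothetical; the action names are simply borrowed from the URLs seen in this article :

```python
# Hypothetical routing table: which HTTP methods each action accepts.
ALLOWED_METHODS = {
    "snippets": {"GET"},        # read-only view
    "deletesnippet": {"POST"},  # changes server state
    "saveprofile": {"POST"},    # changes server state
}


def is_request_allowed(method: str, action: str) -> bool:
    """Return True if the action exists and accepts the given HTTP method."""
    return method.upper() in ALLOWED_METHODS.get(action, set())


print(is_request_allowed("GET", "deletesnippet"))
```

With such a rule, the deletesnippet URL placed in the icon field would no longer delete anything, since a browser loads images with GET. A forged form can still send a POST, however, which is why more is needed.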

Here is a set of countermeasures to apply :

  • Enumerate the form values, evaluate that no extraneous fields show up, and sanitize and filter on expected values
  • CSRF tokens help against arbitrary form submission bots

We have already seen sanitizing in the previous chapters. Let’s go deeper in the CSRF tokens

To avoid a CSRF attack, a potential solution is to embed additional authentication data into the HTTP request, so the web application will be able to detect any unauthorized requests crafted by an attacker and placed into a form

CSRF tokens are typically random numbers that are stored in a cookie or on a server. What will happen is the server will compare the token attached to the incoming requests with the value stored in the cookie or the server. If the values are identical, the server will approve the request. Similarly, it will reject the request if the token is missing or is incorrect

Google proposes to pass an action_token in all HTTP requests, and to use a hash of the value of the user’s cookie appended to a current timestamp (the timestamp in the hash ensures that old tokens can be expired, which mitigates the risk if a token leaks). Using POST requests also reduces the risk of the action_token leaking, since it is then no longer passed as a URL parameter

Here is the proposed code :

token valid for 24 hours

With such a token, an attacker would also need to guess the token to successfully trick a victim into sending a forged request

For an anti-CSRF mechanism to be effective, it needs to be cryptographically secure. The token cannot be easily guessed, so it cannot be generated based on a predictable pattern
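The sketch below shows one way such a time-limited token could be built. It is not the code proposed by Google : it uses an HMAC with a server-side secret instead of a plain hash, and the secret value, the function names and the token layout are assumptions for illustration :

```python
import hashlib
import hmac
import time
from typing import Optional

# Hypothetical server-side secret, never sent to the browser.
TOKEN_SECRET = b"replace-with-a-long-random-secret"
TOKEN_LIFETIME_SECONDS = 24 * 60 * 60  # tokens expire after 24 hours


def generate_action_token(session_cookie: str, now: Optional[float] = None) -> str:
    """Return '<timestamp>|<hmac>' bound to the user's session cookie."""
    timestamp = str(int(time.time() if now is None else now))
    message = (session_cookie + "|" + timestamp).encode("utf-8")
    digest = hmac.new(TOKEN_SECRET, message, hashlib.sha256).hexdigest()
    return timestamp + "|" + digest


def is_action_token_valid(token: str, session_cookie: str,
                          now: Optional[float] = None) -> bool:
    """Accept the token only if it is unexpired and matches the session."""
    timestamp, separator, _digest = token.partition("|")
    if not separator or not (timestamp.isascii() and timestamp.isdigit()):
        return False
    current = time.time() if now is None else now
    if current - int(timestamp) > TOKEN_LIFETIME_SECONDS:
        return False
    # Recompute the token for the claimed timestamp and compare in constant time.
    expected = generate_action_token(session_cookie, now=int(timestamp))
    return hmac.compare_digest(token.encode("utf-8"), expected.encode("utf-8"))


token = generate_action_token("GRUYERE=example-session")
print(is_action_token_valid(token, "GRUYERE=example-session"))
```

Because the token depends on both the victim’s cookie and a secret known only to the server, an attacker preparing a forged request on another site has no practical way to compute it.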

It is recommended to use the anti-CSRF options built into popular frameworks such as AngularJS (https://bit.ly/3yMmGYO) and to refrain from creating your own mechanisms

As a last word, you can find here a good summary of how to protect your forms from malicious inputs : https://bit.ly/3kRgav7


Cross Site Script Inclusion (XSSI)

XSSI is a client-side attack similar to Cross Site Request Forgery (CSRF) but has a different purpose. Where CSRF uses the authenticated user context to execute certain state-changing actions inside a victim’s page (reset password, etc.), XSSI instead uses JavaScript on the client side to leak sensitive data from authenticated sessions

Principles of an XSSI

Let’s follow the example provided by Google. Here is my private snippet on my home page. This is the “sensitive information” that we are going to leak

my private snippet

We can see that /feed.gtl discloses information about our private snippet, as already seen earlier in this article

private snippet in feed.gtl

The following code reads my private snippet content and displays it in an alert box, using JavaScript. The HTTP address points to my local server 127.0.0.1:8008 and my private session, as I’m running Gruyere locally for this XSSI exploit

Exploit.html

My Exploit.html is located in my root Gruyere directory (resources)

The content of my private snippet is shown in the alert box. That’s our sensitive data leak

script alert

Here are some potential countermeasures :

  • Use a CSRF token (as discussed earlier), to make sure that JSON results containing confidential data are only returned to your own pages
  • JSON response pages should only support POST requests, which prevents the script from being loaded via a script tag
  • Make sure that the script is not executable (with sanitizing). The standard way of doing this is to append some non-executable prefix to it. A script running in the same domain can read the contents of the response and strip out the prefix, but scripts running in other domains can’t
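
The last countermeasure can be sketched as follows. This is a minimal illustration rather than Gruyere code : the function names are hypothetical, and the prefix shown is one commonly used value, any text that makes the response invalid JavaScript would do

```python
import json

# Prefix that makes the response a syntax error when loaded through a <script> tag.
XSSI_PREFIX = ")]}',\n"


def protect_json(payload):
    """Server side: serialize the payload and prepend the non-executable prefix."""
    return XSSI_PREFIX + json.dumps(payload)


def read_protected_json(body):
    """Same-origin client: strip the prefix, then parse the JSON that follows."""
    if not body.startswith(XSSI_PREFIX):
        raise ValueError("missing XSSI prefix")
    return json.loads(body[len(XSSI_PREFIX):])
```

A page on another domain that includes the response with a script tag only gets a syntax error, while a same-origin script can fetch the body as text and remove the prefix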

Here are more XSSI examples on the OWASP site : https://bit.ly/3kTbBAu

You can also check this video from Black Hat Europe that provides more explanations


Path Traversal

Information disclosure via path traversal

A common attacker technique is Path Traversal to access files outside of the intended directory : https://bit.ly/3BKWuzF

An attacker may be able to read an unintended file, resulting in information disclosure of sensitive data. Or, an attacker may be able to write to an unintended file, resulting in unauthorized modification of sensitive data or compromising the server’s security

Modern web applications and web servers usually contain quite a bit of information in addition to the standard HTML and CSS, including scripts, images, templates, and configuration files. A web server typically restricts the user from accessing anything higher than the root directory, or web document root, on the server’s file system

A secret is an object that contains a small amount of sensitive data such as a password, a token, or a key. Using a secret means that you don’t need to include confidential data in your application code

A Path Traversal attack will target stored secrets, among other things

https://bit.ly/3yKphTm

Let’s start Burp Suite and check the site map of my Gruyere session. The easiest way is to use the Burp browser (built into the app), as it manages the proxy for you and intercepts the HTTP requests. Here is the result :

site map in Burp

We can see that there is a secret.txt file, that’s our target. Let’s see if Gruyere is vulnerable to Path Traversal attacks

Let’s try with the upload.gtl module. Normally we should not be able to access the code…but we can !

Here is the result when we enter /upload.gtl (expected behaviour)

Let’s go down a level and type in /upload.gtl/test. The result is as follows, because the file “test” does not exist in this hierarchy

Now, let’s move one level back with /upload/test/../, and here is the result. This confirms that Gruyere is vulnerable to Path Traversal

Now let’s find the content of the secret.txt file. I’m on Chrome, so this may not work exactly the same way on your browser. Chrome resolves a plain ../ in the address bar before sending the request, but we can easily get around this by URL-encoding the slash / as %2f (0x2f in hexadecimal)

Let’s type in /secret.txt. This returns the error message as above

Now, let’s move up one directory with the path /..%2fsecret.txt. We find that the content of the secret.txt file is Cookie!
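
To see why the encoded path works, here is a small sketch of what happens when a server decodes the URL and joins it to a directory without further checks. The directory layout (SERVED_DIR) and the resolve helper are assumptions made for illustration, not Gruyere's real layout or code

```python
import posixpath
from urllib.parse import unquote

# Hypothetical directory that the requested file names are appended to.
SERVED_DIR = "/gruyere/resources/12345"


def resolve(requested):
    """Decode the URL path, append it to SERVED_DIR and collapse any '..' parts."""
    decoded = unquote(requested)  # "%2f" becomes "/"
    return posixpath.normpath(SERVED_DIR + decoded)


print(resolve("/secret.txt"))       # stays inside SERVED_DIR
print(resolve("/..%2fsecret.txt"))  # ends up one directory above SERVED_DIR
```

The browser sees no ../ in the address, so it sends the path unchanged, and the traversal only appears once the server has decoded %2f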

An additional note : here is a useful reminder of the commands used to move around a file system

https://red.ht/3kVUAFK

Data tampering via path traversal

Now that we have found that Gruyere is vulnerable to path traversal, we can easily craft a data tampering attack, by changing the content of the secret.txt file. I have written “Path Trasversal” in my file. First, we must upload the file in our session

Now, let’s repeat the actions from the previous paragraph, to launch the path traversal attack :

/secret.txt

/..%2fsecret.txt

We find that the secret.txt file has been replaced by the new one

Let’s conclude this chapter with a few countermeasures

  • don’t store sensitive files on your web server. The only files that should be in your document root folder are those that are needed for the site to function properly
  • make sure you’re running the latest versions of your web server
  • sanitize any user input. Remove everything but the known good data and filter meta characters from the user input. This will ensure that attackers cannot use commands that try to escape the root directory or violate other access privileges
  • remove “..” and “../” from any input that is used in a file context
  • ensure that your web server is properly configured to allow public access to only those directories that are needed for the site to function
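
The input-related countermeasures above can be sketched as a single check : resolve the requested path first, then verify that it still sits under the intended directory. The helper name safe_join is hypothetical, and the caller is assumed to have URL-decoded the path already

```python
import os


def safe_join(base_dir, user_path):
    """Return the absolute path for user_path, or raise if it leaves base_dir."""
    base = os.path.realpath(base_dir)
    # Strip leading separators so os.path.join cannot discard the base directory.
    candidate = os.path.join(base, user_path.lstrip("/\\"))
    target = os.path.realpath(candidate)  # resolves ".." and symbolic links
    if os.path.commonpath([base, target]) != base:
        raise ValueError("path escapes the base directory")
    return target
```

Checking the resolved path is generally more robust than removing “..” from the input, since it does not depend on anticipating every encoding of the same sequence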

Denial of Service

Here we will try some tricks to prevent the Gruyere server from servicing requests, by taking advantage of some server code bugs

DoS – Quit the Server

As we are logged in as admin (from previous privilege escalation achieved in this article), let’s check how to request a server quit command. This is to be found in the “Manage this server” section

We find in the address bar that it is handled by the manage.gtl

manage.gtl

Now let’s create a new account without admin rights. We notice that it is still easy to ask the Gruyere server to quit : just type in /quitserver

Let’s check why this works while we are not logged in as admin. The key question is how Gruyere is supposed to prevent such a request from succeeding

In fact, Gruyere does include some so called “Protected URLs” in the server code

gruyere.py

What is this ? A website is, in general, available to the public. But there may be a need for a separate area that is NOT available to the public. That’s where Protected URLs come in. They make certain parts of a site unavailable to the public and, instead, prompt the visitor for a username and password

Let’s solve the bug by adding /quitserver to the Protected URLs

gruyere.py

Because of the code below, Gruyere sends back an “invalid request” message when a user without admin rights requests a protected URL

gruyere.py
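
The logic of the fix can be summarized with the following sketch. It is a simplified stand-in and not the code from gruyere.py : the list name, the function name, and the first two list entries are placeholders, only the added “/quitserver” entry comes from the fix described above

```python
# Placeholder list; "/quitserver" is the entry added by the fix.
PROTECTED_URLS = ["/quit", "/reset", "/quitserver"]


def is_request_allowed(path, is_admin):
    """Allow everything for admins; refuse protected URLs for everyone else."""
    return is_admin or path not in PROTECTED_URLS
```

With this check in place, a request for /quitserver from a regular account is refused before the server acts on it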

DoS – Overloading the Server

We need to find a way to overload the server when a request is processed. For this, we can build on the fact that Gruyere is vulnerable to Path Traversal attacks

To overload the server, one idea is to use a resource that will put Gruyere in a kind of “infinite loop”, with a request repeating without end, whatever we click in our session

We see that the menubar.gtl file is included in every page we navigate to, so this makes a good candidate for this attack

menubar.gtl

We can create a file named menubar.gtl, that will be replacing the existing one, with the following content

[[include:menubar.gtl]]DoS[[/include:menubar.gtl]]

We can upload and replace the existing menubar.gtl, using a Path Traversal attack. We can create a new user called ../resources, then upload the file using this user profile. This directs the upload to the resources directory, where the new file overwrites the existing one

Here is the result : the loop starts again each time we refresh or navigate the site

We need to use the “reset button” to stop this loop

https://google-gruyere.appspot.com/resetbutton/session ID

The potential fix has been described earlier in the Path Traversal section
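
To illustrate why a template that includes itself never finishes rendering, here is a toy include expander with a depth limit. It is not the GTL implementation : the regular expression, the MAX_DEPTH value and the function name are assumptions made for this example, and the depth limit is an extra safeguard that the walkthrough does not mention

```python
import re

# Matches [[include:NAME]]...[[/include:NAME]] in this toy template syntax.
INCLUDE_RE = re.compile(r"\[\[include:([\w.]+)\]\].*?\[\[/include:\1\]\]", re.S)
MAX_DEPTH = 10  # hypothetical limit on nested includes


def expand_includes(text, files, depth=0):
    """Replace each include block with the (recursively expanded) file content."""
    if depth > MAX_DEPTH:
        raise RecursionError("include depth exceeded")

    def replace(match):
        included = files.get(match.group(1), "")
        return expand_includes(included, files, depth + 1)

    return INCLUDE_RE.sub(replace, text)
```

With the malicious menubar.gtl, every expansion produces another include of the same file, so without the limit the recursion would only stop when the interpreter gives up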


Code Execution

Google tells us to use two previous exploits to execute code. We will therefore use Path Traversal and Denial of Service

The general idea here is to take advantage of these vulnerabilities to attack the Gruyere infrastructure. How to do that ? The GTL template language is a target of choice, as GTL shapes the entire Gruyere web site. Modifying the GTL implementation can permanently alter the site and take it down. We will mount this attack using Path Traversal and Denial of Service

We are therefore going to replace the “gtl.py” file with our own, and “rewrite” the site’s infrastructure and thus “own” the application

The content of the GTL file can be anything. I just wrote “Code Execution Challange” in an empty file and named it gtl.py

Then I prepared the Path Traversal attack by creating the user .., and uploading my file in this profile

Then, I restart the server by typing /quitserver. Gruyere displays the following message : the server has been 0wnd! I have found a way to attack the infrastructure and “own” the server, by replacing the gtl.py file

There are several flaws in the Gruyere code :

  • Gruyere allows users to upload a file with the .py extension (Python file). This should be blocked by proper sanitization, as seen previously
  • Gruyere should be fixed for path traversal flaws, as seen above
  • Gruyere has permission to both read and write files in the gruyere directory. This should not be the case, and Gruyere should run with minimal privileges (https://bit.ly/3typbgB)
  • more generally and for a real world site, your infrastructure should be kept up to date (such as libraries imported by your code), to avoid typical vulnerabilities

In the Gruyere code, the following code section should be modified to restrict where files can be written

gruyere.py
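
As an illustration of the kind of restriction meant here, the sketch below accepts an upload only if its name is a plain file name with an allowed extension. The allowlist and the function name are hypothetical, and a real fix would combine this with the path checks discussed in the Path Traversal section

```python
import os

# Hypothetical allowlist: no scripts (.py) and no templates (.gtl).
ALLOWED_EXTENSIONS = {".txt", ".png", ".jpg", ".gif"}


def is_upload_allowed(filename):
    """Accept only bare file names whose extension is on the allowlist."""
    # Treat backslashes as separators too, then keep the last path component.
    name = os.path.basename(filename.replace("\\", "/"))
    if name != filename or name in ("", ".", ".."):
        return False  # the name carried a directory part, or is not a file name
    extension = os.path.splitext(name)[1].lower()
    return extension in ALLOWED_EXTENSIONS
```

An allowlist is used rather than a blocklist, so a file type that nobody thought of is refused by default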

Configuration Vulnerabilities

We are going to try leaking data stored in the Gruyere database. This is a cool challenge as database exposures are such a big thing nowadays

Information disclosure #1

Looking into the file system of Gruyere, we notice an interesting and potentially sensitive file which is dump.gtl

Here is the code

dump.gtl

It really looks like a database dump program. What is this ?

Usually, a database dump contains a record of the table structure and/or the data from a database, and in real life, is usually in the form of a list of SQL statements

A database dump is most often used for backing up a database so that its contents can be restored in the event of data loss. The database program makes it possible to extract the data for backup. A dump can also be used in situations where you need to debug the server

To access the dump, just type this address in your session : /dump.gtl

Here is the content of the dump, consistent with the above code

We can see user names and passwords in clear text

There are obviously major flaws here :

  • first of all, this file should not be stored here
  • passwords are not hashed or encrypted, but stored in clear text
  • the dump program should be strictly restricted to admin rights
  • the dump files should be stored in a specific location with a specific access mechanism (IP, port, authentication)
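
Regarding the password storage point, here is a minimal sketch of salted password hashing with the Python standard library. The function names and the iteration count are assumptions chosen for the example, and a dedicated password hashing library would be a reasonable alternative

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune it for your hardware


def hash_password(password, salt=None):
    """Return (salt, digest); a fresh random salt is generated when none is given."""
    salt = os.urandom(16) if salt is None else salt
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)
```

Only the salt and the digest are stored, so a leaked dump no longer reveals the passwords directly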

Here are the OWASP recommendations for database security : https://bit.ly/2XtsOsg. Obviously Gruyere did not implement any of these recommendations

To build further awareness of potential exploitation of debuggers in real life, I suggest checking this link : https://bit.ly/2VNQxCO

Are such situations common in real life ?

Unfortunately, Gruyere is quite representative of what you can find on websites exposed to the Internet. We all know that there are thousands of databases which are ill-configured and exposed to attackers

You can have a first look by typing this query in Google : intitle:"index of/" "*.sql"

This will just search for SQL databases in website root directories, and reveal many exposed databases (some being quite critical from a GDPR point of view)

Information disclosure #2

Unfortunately, deleting the dump.gtl file will not secure Gruyere. One big issue, as we know, is that we can easily upload any kind of file we want. So, we can upload another dump script and leak the data

There should be protections included in the code (preventing some file formats such as templates, scripts,…). The following code on the server side permits unrestricted file uploads

gruyere.py

Here is a valuable checklist focused on file upload vulnerabilities : https://bit.ly/3Crw0E, and also a good video about file upload vulnerability, providing a quite thorough review of the topic

Information disclosure #3

The target here is to continue leaking the Gruyere database, not using a dump function or upload vulnerability, but relying on weaknesses in the Gruyere code

We can try using some functions existing in Python, such as pprint – data pretty printer, to display the database content, and inject this directly into the “new snippet” window

The pprint module provides a capability to “pretty-print” Python data structures. The formatted representation keeps objects on a single line if it can, and breaks them onto multiple lines if they don’t fit within the allowed width
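
Here is a short example of this behaviour, using the standard pprint module on made-up data (the user records below are hypothetical and are not taken from the Gruyere database)

```python
from pprint import pformat

# A small structure fits within the default width and stays on a single line.
small = {"user": "brie", "is_admin": False}

# A larger structure exceeds the default width and is broken onto several lines.
large = {
    "administrator": {"name": "Admin", "pw": "secret", "is_admin": True},
    "cheddar": {"name": "Cheddar Mac", "pw": "orange", "is_admin": False},
}

small_text = pformat(small)
large_text = pformat(large)

print(small_text)
print(large_text)
```

pformat returns the formatted text as a string, while pprint prints it directly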

In a template language such as Django, variables look like this : {{ variable }}

When the template engine encounters a variable, it evaluates that variable and replaces it with the result

Reading the GTL source, we find some explanations : “_db” stands for the database variable in the GTL language and can be used as a special value

gtl.py

So, we can input this variable in the “new snippet” window, as the GTL engine will evaluate it

{{_db:pprint}}

Here is our database directly on the “my snippet” page

There is one flaw in the way the template code parses the variable values

ExpandTemplate calls _ExpandBlocks followed by _ExpandVariables

gtl.py

_ExpandBlock calls ExpandTemplate on nested blocks. So if a variable is expanded inside a nested block and contains something that looks like a variable template, it will get expanded a second time
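
The double expansion can be reproduced with a toy template engine. The sketch below is not the code from gtl.py : it uses simplified regular expressions and hypothetical function names, and only mirrors the call order described above (blocks first, then variables, with nested blocks going through the full expansion again)

```python
import re

BLOCK_RE = re.compile(r"\[\[if:(\w+)\]\](.*?)\[\[/if:\1\]\]", re.S)
VAR_RE = re.compile(r"\{\{(\w+)\}\}")


def expand_variables(template, values):
    """Replace each {{name}} once; inserted text is not scanned again here."""
    return VAR_RE.sub(lambda m: str(values.get(m.group(1), "")), template)


def expand_blocks(template, values):
    """Expand [[if:name]] blocks; the body goes through the full expansion."""
    def replace(match):
        name, body = match.group(1), match.group(2)
        return expand_template(body, values) if values.get(name) else ""
    return BLOCK_RE.sub(replace, template)


def expand_template(template, values):
    """Blocks first, then variables: block output is scanned a second time."""
    return expand_variables(expand_blocks(template, values), values)


values = {"show": True, "snippet": "{{_db}}", "_db": "DATABASE CONTENT"}
outside = expand_template("<li>{{snippet}}</li>", values)
inside = expand_template("[[if:show]]<li>{{snippet}}</li>[[/if:show]]", values)
```

Outside a block, the snippet text {{_db}} is inserted as-is. Inside a block, the nested call inserts {{_db}} and the outer call then expands it again, which is how user input ends up being evaluated as a template variable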

In addition to this design flaw, the template language should not allow arbitrary database access and should restrict which queries are possible


AJAX vulnerabilities

Before starting these last two challenges, let’s recall what we have seen earlier

Gruyere uses AJAX principles to implement refresh on the home and snippets page

When clicking the refresh link, Gruyere fetches feed.gtl which contains refresh data for the current page and then the client-side script uses the browser DOM API (Document Object Model) to insert the new snippets into the page

Since AJAX runs code on the client side, this script is visible to attackers who do not have access to the source code. We can see the code using Burp Suite

DoS via AJAX

First of all, let’s sign in using my Gruyere account “Forensicxs”

Forensicxs

We can see the snippets. Clicking on “refresh”, we see the response corresponding to the snippets content

Forensicxs

Then, let’s create a user “private_snippet“, and create several snippets

Here is the response. The snippets of the other users have been deleted

private_snippet

The flaw here is the structure of the response. We see the construction here

showprofile.gtl

And also in the lib.js

lib.js

This manipulation of the Document Object Model, by injecting code and pushing a data “offset”, is somewhat similar to a buffer overflow

Google says that the structure of the response should be as follows, to avoid this “offset” of the user snippets by the attacker snippets :

[<private_snippet>, {<user> : <snippet>,…}]
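
The difference between the two structures can be shown with plain Python dictionaries. This is a simplified model of the response, with hypothetical function names : in the flat form, a user who picks the name private_snippet collides with the key reserved for the private snippet

```python
def flat_feed(private_snippet, public_snippets):
    """Flat structure: the reserved key and the user names share one namespace."""
    feed = {"private_snippet": private_snippet}
    feed.update(public_snippets)  # a user named "private_snippet" overwrites the key
    return feed


def nested_feed(private_snippet, public_snippets):
    """Proposed structure: the private snippet lives outside the user mapping."""
    return [private_snippet, dict(public_snippets)]


public = {"private_snippet": "attacker text", "brie": "hello"}
flat = flat_feed("my secret", public)
nested = nested_feed("my secret", public)
```

In the nested form, the attacker's entry is just another user in the mapping and can no longer replace the private snippet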

Phishing via AJAX

The target here is to inject in the page some links to a phishing site

I created a user called “Phishing”, and I created a snippet with the following text

<a href='https://www.forensicxs.com'>Sign in</a>
| <a href='https://www.forensicxs.com'>Sign up</a>

Here is the result

So now, the page shows additional links to sign in/sign up. A user could be tricked into clicking on such links, which would trigger a phishing attack by forwarding the user to a specially crafted page that looks like the Gruyere page and includes some malicious code to take control of the user session

We have seen in these two challenges, that the DOM should be better protected against potential manipulations, for example, by applying a prefix to user values like id="user_"

home.gtl
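
Here is a small sketch of the prefix idea, written in Python for illustration. The function names are hypothetical, and escaping the snippet text is an additional assumption (it supposes that snippets are meant to be plain text), not something the walkthrough prescribes

```python
import html


def dom_id_for_user(username):
    """Prefix user-controlled names so they cannot match the page's own ids."""
    return "user_" + username


def render_snippet(username, snippet):
    """Build the element for one snippet, escaping both the id and the text."""
    element_id = html.escape(dom_id_for_user(username), quote=True)
    return '<div id="{}">{}</div>'.format(element_id, html.escape(snippet))
```

With the prefix, a user who names an account after an existing page element can no longer have snippets written into that element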

Conclusion

We have seen most of the major web hacking techniques in this article. For learning, Google Gruyere is really a very good platform, as it combines the client-side application and the server code with a well-documented walkthrough

I hope this article helps you build a good understanding. In any case, thanks to the Google team for providing this excellent learning platform