Cross-Site Scripting (XSS) Sanitization/Validation Concepts in Multiple Languages: Java, Python, JavaScript, C#, Go, Rust, PHP, Ruby, and C++

Sep 17

This post shares foundational XSS sanitization and validation concepts in Java, Python, JavaScript, C#, Go, Rust, PHP, Ruby, and C++. The examples below are a starting point for demonstrating input sanitization that protects against cross-site scripting while still allowing "safe" HTML elements. They are not complete implementations, but the concepts they demonstrate can help you build complete sanitization functions in each of these languages.
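As a point of contrast with the allowlist approach used throughout this post: when a field does not need to accept any HTML at all, encoding every special character on output is simpler than sanitizing. Here is a minimal sketch in Python using only the standard library. The function name `escape_untrusted` is hypothetical and only used for illustration.

```python
import html


def escape_untrusted(text: str) -> str:
    """Encode HTML-significant characters so the text is rendered literally."""
    # quote=True also encodes double and single quotes, which matters when the
    # value ends up inside an HTML attribute
    return html.escape(text, quote=True)


payload = '<script>alert("XSS")</script>'
print(escape_untrusted(payload))
```

Escaping like this neutralizes all markup, so it only fits inputs where no formatting should survive. The library-based examples that follow cover the case where some HTML must be kept.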

Java:

To sanitize an input string for Cross-Site Scripting (XSS) in Java while allowing safe HTML elements, you can use a library like the OWASP Java HTML Sanitizer.

First, include the library in your project. You can download the JAR file from the official OWASP GitHub repository, or add it as a dependency (Maven coordinates `com.googlecode.owasp-java-html-sanitizer:owasp-java-html-sanitizer`) if you're using a build tool like Maven or Gradle.

Here's an example of a Java function that sanitizes input using the library:

```java
import org.owasp.html.PolicyFactory;
import org.owasp.html.Sanitizers;

public class XssSanitizer {

    public static String sanitizeInput(String input) {
        // Combine the prebuilt policies for formatting, block, image, and link elements
        PolicyFactory policy = Sanitizers.FORMATTING.and(Sanitizers.BLOCKS)
                .and(Sanitizers.IMAGES).and(Sanitizers.LINKS);

        // Sanitize the input string
        return policy.sanitize(input);
    }

    public static void main(String[] args) {
        String userInput = "<a href='https://example.com' onclick='alert(\"XSS Attack\")'>Click here</a>";
        String sanitizedInput = sanitizeInput(userInput);
        System.out.println(sanitizedInput);
    }
}
```

In this example, the prebuilt policies in the `Sanitizers` class are combined into a single `PolicyFactory` that allows formatting, block, image, and link elements with their safe attributes. Everything else, including the `onclick` handler in the sample input, is removed. For finer control over which elements and attributes are included or excluded, you can build a custom policy with `HtmlPolicyBuilder`.

The library must be on your project's classpath for this code to compile and run.

Python 3:

To sanitize an input string for Cross-Site Scripting (XSS) while allowing safe HTML elements, you can use a library like `bleach`, which provides fine-grained control over which HTML elements and attributes are allowed. Its maintainers have marked the project as deprecated, so check its current status before adopting it in new code.

First, install the `bleach` library if you haven't already:

```bash
pip install bleach
```

Now, you can create the function:

```python
import bleach


def sanitize_input(input_string):
    """
    Sanitize an input string to prevent Cross-Site Scripting (XSS) attacks
    while allowing safe HTML elements and attributes.

    Args:
        input_string (str): The input string to be sanitized.

    Returns:
        str: The sanitized input string.
    """
    # Define the list of allowed HTML tags
    allowed_tags = [
        'a', 'abbr', 'acronym', 'b', 'blockquote', 'code', 'em', 'i', 'li',
        'ol', 'strong', 'ul', 'p', 'br', 'div', 'span', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6'
    ]

    # Define a dictionary of allowed attributes for each tag
    allowed_attributes = {
        'a': ['href', 'title'],
        'abbr': ['title'],
        'acronym': ['title'],
    }

    # Sanitize the input string
    sanitized_string = bleach.clean(input_string, tags=allowed_tags, attributes=allowed_attributes)

    return sanitized_string
```

This function allows only the specified HTML tags and attributes. Disallowed attributes are removed, and disallowed tags are escaped by default so they render as text (pass `strip=True` to `bleach.clean` to remove them instead). You can customize `allowed_tags` and `allowed_attributes` to meet your specific requirements.

Example usage:

```python
user_input = '<a href="https://example.com" onclick="alert(\'XSS Attack\')">Click here</a>'
sanitized_input = sanitize_input(user_input)
print(sanitized_input)
```

The output will be:

```html
<a href="https://example.com">Click here</a>
```

This sanitized string allows safe HTML elements (`<a>`) and attributes (`href`) while removing the `onclick` attribute that could potentially lead to an XSS attack.
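To make the allowlist idea concrete without any third-party dependency, here is a minimal sketch built on the standard library's `html.parser`. It is not how `bleach` is implemented, and names such as `AllowlistSanitizer`, `sanitize_html`, and the constants are hypothetical. It assumes that links must be absolute `http`, `https`, or `mailto` URLs, and it does not balance unclosed tags, so treat it as an illustration of the concept rather than a replacement for a maintained library.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urlparse

ALLOWED_TAGS = {"a", "b", "em", "i", "strong", "p", "br"}
ALLOWED_ATTRIBUTES = {"a": {"href", "title"}}
ALLOWED_SCHEMES = {"http", "https", "mailto"}
VOID_TAGS = {"br"}
DROP_WITH_CONTENT = {"script", "style"}


def is_safe_url(url):
    # Relative URLs have an empty scheme and are rejected by this strict check
    try:
        return urlparse(url.strip()).scheme.lower() in ALLOWED_SCHEMES
    except ValueError:
        return False


class AllowlistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # the tag is dropped, but its text content is still emitted
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_ATTRIBUTES.get(tag, set()):
                continue
            if name == "href" and not is_safe_url(value):
                continue
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.parts.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            # The parser decodes character references, so re-encode the text
            self.parts.append(escape(data, quote=False))


def sanitize_html(text):
    parser = AllowlistSanitizer()
    parser.feed(text)
    parser.close()
    return "".join(parser.parts)


print(sanitize_html('<a href="javascript:alert(1)" title="t">x</a><script>alert(2)</script>'))
```

The key design choice is that output is rebuilt from scratch: only tags and attributes that match the allowlist are written back, and all text is re-encoded, so anything the sketch does not recognize never reaches the result.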

JavaScript:

To sanitize an input string for Cross-Site Scripting (XSS) in JavaScript while allowing safe HTML elements, you can use a library like DOMPurify.

First, include the DOMPurify library in your project. You can load it directly from a CDN or install it using npm.

Here's an example of a JavaScript function that sanitizes input using DOMPurify:

```javascript
// Assumes DOMPurify is already loaded, for example through a <script> tag
// in the browser or an import in a bundled project
function sanitizeInput(input) {
  // DOMPurify configuration listing the allowed HTML elements and attributes
  const config = {
    ALLOWED_TAGS: ['a', 'abbr', 'acronym', 'b', 'blockquote', 'code', 'em', 'i', 'li', 'ol', 'strong', 'ul', 'p', 'br', 'div', 'span', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6'],
    ALLOWED_ATTR: ['href', 'title']
  };

  // Sanitize the input string using DOMPurify
  return DOMPurify.sanitize(input, config);
}

// Example usage
const userInput = '<a href="https://example.com" onclick="alert(\'XSS Attack\')">Click here</a>';
const sanitizedInput = sanitizeInput(userInput);
console.log(sanitizedInput);
```

In this example, a configuration object (`config`) specifies the allowed HTML elements and attributes. You can customize the `ALLOWED_TAGS` and `ALLOWED_ATTR` arrays to include or exclude specific elements and attributes as needed.

The `DOMPurify.sanitize` function sanitizes the input string according to the specified configuration and returns the sanitized result.

DOMPurify must be loaded before this code runs. In the browser, include it in your HTML file. In a Node.js environment, install it using npm and pair it with a DOM implementation such as jsdom, because DOMPurify needs a DOM to work with.

C#:

To sanitize an input string for Cross-Site Scripting (XSS) in C# while allowing safe HTML elements, you can use the `HtmlSanitizer` library.

First, install the `HtmlSanitizer` NuGet package in your project from the Package Manager Console (or run `dotnet add package HtmlSanitizer` from a regular shell):

```powershell
Install-Package HtmlSanitizer
```

Now, you can create a C# function to sanitize input using the `HtmlSanitizer` library:

```csharp
using System;
using Ganss.Xss; // Releases before 8.0 use the Ganss.XSS namespace instead

public class XssSanitizer
{
    public static string SanitizeInput(string input)
    {
        // Create an instance of the HtmlSanitizer
        var sanitizer = new HtmlSanitizer();

        // Replace the library's default allowlists with a stricter policy
        sanitizer.AllowedTags.Clear();           // Discard the default allowed tags
        sanitizer.AllowedTags.Add("a");          // Allow <a> tags
        sanitizer.AllowedAttributes.Clear();     // Discard the default allowed attributes
        sanitizer.AllowedAttributes.Add("href"); // Allow the href attribute

        // Sanitize the input string using the defined policy
        return sanitizer.Sanitize(input);
    }

    public static void Main(string[] args)
    {
        string userInput = "<a href='https://example.com' onclick='alert(\"XSS Attack\")'>Click here</a>";
        string sanitizedInput = SanitizeInput(userInput);
        Console.WriteLine(sanitizedInput);
    }
}
```

In this example, we create an instance of `HtmlSanitizer`, which starts out with default allowlists of tags and attributes. We then narrow the policy to a single element (`<a>`) and a single attribute (`href`) by clearing those defaults and adding back only the entries we want to allow.

The `SanitizeInput` function uses this policy to sanitize the input string, removing any disallowed elements and attributes.

Make sure to install the `HtmlSanitizer` package in your project and include the appropriate `using` directive (`using Ganss.Xss;`, or `using Ganss.XSS;` for releases before 8.0) for this code to work.

Go:

To sanitize an input string for Cross-Site Scripting (XSS) in Go while allowing safe HTML elements, you can use a library like `bluemonday`.

First, install the `bluemonday` package if you haven't already:

```bash
go get github.com/microcosm-cc/bluemonday
```

Now, you can create a Go function to sanitize input using `bluemonday`:

```go
package main

import (
	"fmt"

	"github.com/microcosm-cc/bluemonday"
)

func SanitizeInput(input string) string {
	// Start from an empty policy that allows nothing
	p := bluemonday.NewPolicy()

	// Only accept parseable URLs with standard schemes (http, https, mailto)
	p.AllowStandardURLs()

	// Allow <a> elements and their href attribute
	p.AllowElements("a").
		AllowAttrs("href").OnElements("a")

	// Sanitize the input string using the policy
	return p.Sanitize(input)
}

func main() {
	userInput := `<a href="https://example.com" onclick="alert('XSS Attack')">Click here</a>`
	sanitizedInput := SanitizeInput(userInput)
	fmt.Println(sanitizedInput)
}
```

In this example, we import the `bluemonday` library and create an empty policy (`p`) with `NewPolicy()`, which allows nothing by default. `AllowStandardURLs()` restricts link targets to parseable URLs with standard schemes and also makes `bluemonday` add `rel="nofollow"` to links. We then allow a specific HTML element (`<a>`) and attribute (`href`) by chaining the `AllowElements` and `AllowAttrs` methods.

The `SanitizeInput` function uses this policy to sanitize the input string, removing any disallowed elements and attributes. If you would rather start from a broader prebuilt policy for user-generated content, `bluemonday.UGCPolicy()` already allows links and common formatting elements.

Make sure to include the `github.com/microcosm-cc/bluemonday` package in your Go project for this code to work.

Rust:

To sanitize an input string for Cross-Site Scripting (XSS) in Rust while allowing safe HTML elements, you can use the `ammonia` crate, an allowlist-based HTML sanitizer.

First, add `ammonia` to your `Cargo.toml` file:

```toml
[dependencies]
ammonia = "4"
```

Now, you can create a Rust function to sanitize input using `ammonia`:

```rust
use std::collections::{HashMap, HashSet};

use ammonia::Builder;

fn sanitize_input(input: &str) -> String {
    // Allow only <a> elements
    let tags: HashSet<&str> = HashSet::from(["a"]);

    // Allow only the href attribute on <a> elements
    let mut tag_attributes: HashMap<&str, HashSet<&str>> = HashMap::new();
    tag_attributes.insert("a", HashSet::from(["href"]));

    // Build the policy and sanitize the input string
    Builder::default()
        .tags(tags)
        .tag_attributes(tag_attributes)
        .generic_attributes(HashSet::new()) // No attributes allowed on every tag
        .clean(input)
        .to_string()
}

fn main() {
    let user_input = "<a href='https://example.com' onclick='alert(\"XSS Attack\")'>Click here</a>";
    let sanitized_input = sanitize_input(user_input);
    println!("{}", sanitized_input);
}
```

In this example, we add `ammonia` as a dependency and start from `Builder::default()`. We then replace its default allowlists so that only a specific HTML element (`<a>`) and attribute (`href`) are allowed, using the `tags`, `tag_attributes`, and `generic_attributes` methods.

The `sanitize_input` function uses this policy to sanitize the input string, removing any disallowed elements and attributes. By default, `ammonia` also adds a `rel` attribute to links and restricts URLs to a set of safe schemes.

Make sure `ammonia` is listed in your `Cargo.toml` for this code to work.

PHP:

To sanitize an input string for Cross-Site Scripting (XSS) in PHP while allowing safe HTML elements, you can use the HTML Purifier library.

First, install HTML Purifier if you haven't already. You can download it from the official website or install it using Composer.

Here's an example of a PHP function that sanitizes input using HTML Purifier:

```php
<?php

// Include the HTML Purifier library (adjust the path to your installation)
require_once 'path/to/HTMLPurifier/library/HTMLPurifier.auto.php';

function sanitizeInput($input) {
    // Create a configuration instance
    $config = HTMLPurifier_Config::createDefault();

    // Define a policy that allows safe HTML elements and attributes
    $config->set('HTML.Allowed', 'a[href|title],abbr,acronym,b,blockquote,code,em,i,li,ol,strong,ul,p,br,div,span,h1,h2,h3,h4,h5,h6');

    // Create the HTML Purifier instance
    $purifier = new HTMLPurifier($config);

    // Sanitize the input string using the defined policy
    return $purifier->purify($input);
}

// Example usage
$userInput = '<a href="https://example.com" onclick="alert(\'XSS Attack\')">Click here</a>';
$sanitizedInput = sanitizeInput($userInput);
echo $sanitizedInput;
```

In this example, we include the HTML Purifier library by requiring its auto-load file. We then create a configuration instance and define a policy using the `set('HTML.Allowed', ...)` method. The policy allows specific HTML elements and attributes, and you can customize the list as needed.

The `sanitizeInput` function uses this policy to sanitize the input string, removing any disallowed elements and attributes.

Make sure the `require_once` path points to the HTML Purifier auto-load file in your installation. If you installed the library with Composer, require Composer's own autoloader instead.

Ruby:

To sanitize an input string for Cross-Site Scripting (XSS) in Ruby while allowing safe HTML elements, you can use the `Loofah` gem.

First, add the `loofah` gem to your `Gemfile` and run `bundle install`:

```ruby
gem 'loofah'
```

Now, you can create a Ruby function to sanitize input using `Loofah`:

```ruby
require 'loofah'

# Allowlist of elements and per-element attributes
ALLOWED_TAGS = %w[a abbr acronym b blockquote code em i li ol strong ul p br div span h1 h2 h3 h4 h5 h6].freeze
ALLOWED_ATTRIBUTES = { 'a' => %w[href title] }.freeze

# Custom scrubber that enforces the allowlist above. It walks bottom-up so the
# children of a removed element have already been scrubbed when they are kept.
ALLOWLIST_SCRUBBER = Loofah::Scrubber.new(direction: :bottom_up) do |node|
  next unless node.element?

  if ALLOWED_TAGS.include?(node.name)
    allowed = ALLOWED_ATTRIBUTES.fetch(node.name, [])
    node.attribute_nodes.each { |attr| attr.remove unless allowed.include?(attr.name) }
  else
    # Drop the disallowed tag but keep its children
    node.before(node.children)
    node.remove
  end
end

def sanitize_input(input)
  # Apply Loofah's built-in :strip scrubber first (it removes unsafe tags,
  # attributes, and URL schemes), then narrow the result to the allowlist
  Loofah.fragment(input).scrub!(:strip).scrub!(ALLOWLIST_SCRUBBER).to_s
end

# Example usage
user_input = '<a href="https://example.com" onclick="alert(\'XSS Attack\')">Click here</a>'
sanitized_input = sanitize_input(user_input)
puts sanitized_input
```

In this example, we require the `loofah` gem and define a custom `Loofah::Scrubber` whose block is called for every node and enforces an allowlist of HTML elements and attributes. The `sanitize_input` function first runs Loofah's built-in `:strip` scrubber, which applies the gem's own safety rules, and then runs the custom scrubber to remove anything outside the allowlist.

Make sure to include the `loofah` gem in your project for this script to work.

C++:

The C++ standard library has no facilities for HTML parsing or sanitization. To sanitize an input string for Cross-Site Scripting (XSS) in C++ while allowing safe HTML elements, you typically have to parse and filter the HTML yourself, which is complex and error-prone. For this reason, using a specialized library or a language with better HTML processing support (like Python or Ruby) is often a more practical choice.

However, if you still want to perform this task in C++, you can use a third-party library such as Gumbo or libtidy to parse the HTML and then filter the result. Here's a simplified example using Gumbo. This is a basic example, and a production-level application would need more advanced filtering:

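```cpp
#include <iostream>
#include <string>
#include <gumbo.h>

// Escape characters that are significant in HTML text and attribute values
std::string escapeHtml(const std::string& text) {
    std::string out;
    for (char c : text) {
        switch (c) {
            case '&':  out += "&amp;";  break;
            case '<':  out += "&lt;";   break;
            case '>':  out += "&gt;";   break;
            case '"':  out += "&quot;"; break;
            case '\'': out += "&#39;";  break;
            default:   out += c;
        }
    }
    return out;
}

// Accept only http(s) URLs so that javascript: and similar schemes are rejected
bool isSafeUrl(const std::string& url) {
    return url.rfind("http://", 0) == 0 || url.rfind("https://", 0) == 0;
}

// Recursively rebuild the markup, keeping only text and <a href="...">
void sanitizeNode(const GumboNode* node, std::string& out) {
    if (node->type == GUMBO_NODE_TEXT || node->type == GUMBO_NODE_WHITESPACE) {
        out += escapeHtml(node->v.text.text);
        return;
    }
    if (node->type != GUMBO_NODE_ELEMENT) {
        return;  // Drop comments, CDATA sections, and templates
    }
    const GumboElement& element = node->v.element;
    if (element.tag == GUMBO_TAG_SCRIPT || element.tag == GUMBO_TAG_STYLE) {
        return;  // Drop these elements together with their contents
    }
    const bool allowed = (element.tag == GUMBO_TAG_A);
    if (allowed) {
        out += "<a";
        const GumboAttribute* href = gumbo_get_attribute(&element.attributes, "href");
        if (href != nullptr && isSafeUrl(href->value)) {
            out += " href=\"" + escapeHtml(href->value) + "\"";
        }
        out += ">";
    }
    // Children of disallowed elements are still visited, so their text is kept
    for (unsigned int i = 0; i < element.children.length; ++i) {
        sanitizeNode(static_cast<const GumboNode*>(element.children.data[i]), out);
    }
    if (allowed) {
        out += "</a>";
    }
}

std::string sanitizeInput(const std::string& input) {
    GumboOutput* output = gumbo_parse(input.c_str());
    std::string sanitized;
    sanitizeNode(output->root, sanitized);
    gumbo_destroy_output(&kGumboDefaultOptions, output);
    return sanitized;
}

int main() {
    std::string userInput = "<a href='https://example.com' onclick='alert(\"XSS Attack\")'>Click here</a>";
    std::string sanitizedInput = sanitizeInput(userInput);
    std::cout << sanitizedInput << std::endl;
    return 0;
}
```

In this example, we use the Gumbo library to parse the HTML. Instead of modifying the parse tree, we walk it and write a new string that contains only escaped text and `<a>` tags with an `http` or `https` `href` attribute, so other tags and attributes (such as `onclick`) never reach the output. Please note that this is a basic illustration, and a production-level solution would require more extensive filtering and validation of HTML content to protect against XSS attacks.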
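One way to approach the additional validation mentioned above is to re-parse a sanitizer's output in tests and flag anything that looks risky. The sketch below does this in Python with the standard library only. The names `XssAuditor` and `find_xss_indicators` are hypothetical, and the checks are simple heuristics chosen for illustration, not a complete list of XSS vectors.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

DANGEROUS_TAGS = {"script", "style", "iframe", "object", "embed"}
URL_ATTRIBUTES = {"href", "src", "action", "formaction"}
SAFE_SCHEMES = {"", "http", "https", "mailto"}  # "" covers relative URLs


class XssAuditor(HTMLParser):
    """Collects findings that suggest a sanitizer let something risky through."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        if tag in DANGEROUS_TAGS:
            self.findings.append(f"dangerous tag <{tag}>")
        for name, value in attrs:
            if name.startswith("on"):
                self.findings.append(f"event handler attribute '{name}' on <{tag}>")
            elif name in URL_ATTRIBUTES and value and not self._has_safe_scheme(value):
                self.findings.append(f"unsafe URL in '{name}' on <{tag}>")

    @staticmethod
    def _has_safe_scheme(url):
        # Browsers drop tabs and newlines inside URLs, so strip whitespace
        # and control characters before parsing
        cleaned = "".join(ch for ch in url if ch > " ")
        try:
            return urlparse(cleaned).scheme.lower() in SAFE_SCHEMES
        except ValueError:
            return False


def find_xss_indicators(markup):
    auditor = XssAuditor()
    auditor.feed(markup)
    auditor.close()
    return auditor.findings


print(find_xss_indicators('<a href="https://example.com" onclick="alert(1)">Click here</a>'))
```

A check like this can run over the output of any of the sanitizers in this post as a regression test: an empty list means none of these particular heuristics fired, which is a useful signal but not proof that the markup is safe.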

References:

Patrick Kelly

Gratitech