7.7 CVE-2024-52595

 

lxml_html_clean is a project for HTML cleaning functionalities copied from `lxml.html.clean`. Prior to version 0.4.0, the HTML Parser in lxml does not properly handle context-switching for special HTML tags such as `<svg>`, `<math>` and `<noscript>`. This behavior deviates from how web browsers parse and interpret such tags. Specifically, content in CSS comments is ignored by lxml_html_clean but may be interpreted differently by web browsers, enabling malicious scripts to bypass the cleaning process. This vulnerability could lead to Cross-Site Scripting (XSS) attacks, compromising the security of users relying on lxml_html_clean in its default configuration for sanitizing untrusted HTML content. Users employing the HTML cleaner in a security-sensitive context should upgrade to lxml_html_clean 0.4.0, which addresses this issue. As a temporary mitigation, users can configure lxml_html_clean with the following settings to prevent exploitation of this vulnerability. Via `remove_tags`, one may specify tags to remove; their content is moved into their parent tags. Via `kill_tags`, one may specify tags to be removed completely, including their content. Via `allow_tags`, one may restrict the set of permissible tags, excluding context-switching tags like `<svg>`, `<math>` and `<noscript>`.
https://nvd.nist.gov/vuln/detail/CVE-2024-52595
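The temporary mitigation described above could be sketched roughly as follows. This is a minimal illustration, not the project's recommended configuration: it assumes the `lxml_html_clean` package is installed, the helper names `make_kill_cleaner` and `make_allowlist_cleaner` are hypothetical, and the allowlist is an arbitrary example.

```python
# Tags the advisory names as switching the parsing context in browsers.
CONTEXT_SWITCHING_TAGS = ["svg", "math", "noscript"]

# Hypothetical allowlist for illustration only; choose the tags your application needs.
EXAMPLE_ALLOWED_TAGS = ["div", "p", "a", "b", "i", "em", "strong", "ul", "ol", "li", "br"]


def make_kill_cleaner():
    """Build a Cleaner that removes the listed tags together with their content."""
    # Imported lazily so the constants above can be reused without the package.
    from lxml_html_clean import Cleaner  # third-party: pip install lxml_html_clean

    return Cleaner(kill_tags=CONTEXT_SWITCHING_TAGS)


def make_allowlist_cleaner():
    """Build a Cleaner that keeps only the tags in EXAMPLE_ALLOWED_TAGS."""
    from lxml_html_clean import Cleaner

    # allow_tags and remove_unknown_tags cannot both be active, so the latter
    # is switched off explicitly.
    return Cleaner(allow_tags=EXAMPLE_ALLOWED_TAGS, remove_unknown_tags=False)


if __name__ == "__main__":
    untrusted = "<div><svg><style>/* hidden */</style></svg><p>hello</p></div>"
    print(make_kill_cleaner().clean_html(untrusted))
    print(make_allowlist_cleaner().clean_html(untrusted))
```

With `kill_tags` the whole subtree of each listed tag is dropped, whereas tags excluded through `allow_tags` are removed but their content is kept in the parent, so the two variants are not equivalent. Upgrading remains the fix named in the advisory.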

Categories

CWE-184 : Incomplete List of Disallowed Inputs
The product implements a protection mechanism that relies on a list of inputs (or properties of inputs) that are not allowed by policy or otherwise require other action to neutralize before additional processing takes place, but the list is incomplete.

Alternate terms: Denylist / Deny List (used by CWE and CAPEC instead of other commonly used terms; its counterpart is allowlist). Blocklist / Block List (often used by security tools such as firewalls, email or web gateways, proxies, etc.). Blacklist / Black List (frequently used, but usage has been declining as organizations have started to adopt other terms).

Exploitation of a vulnerability with commonly used manipulations might fail, but minor variations might succeed.

Do not rely exclusively on detecting disallowed inputs. There are too many variants to encode a character, especially when different environments are used, so there is a high likelihood of missing some variants. Only use detection of disallowed inputs as a mechanism for detecting suspicious activity. Ensure that you are using other protection mechanisms that only identify "good" input, such as lists of allowed inputs, and ensure that you are properly encoding your outputs.

Observed examples:
Chain: API for text generation using Large Language Models (LLMs) does not include the "\" Windows folder separator in its denylist (CWE-184) when attempting to prevent Local File Inclusion via path traversal (CWE-22), allowing deletion of arbitrary files on Windows systems.
Product uses a denylist to identify potentially dangerous content, allowing attacker to bypass a warning.
PHP remote file inclusion in web application that filters "http" and "https" URLs, but not "ftp".
Programming language does not filter certain shell metacharacters in Windows environment.
XSS filter doesn't filter null characters before looking for dangerous tags, which are ignored by web browsers. MIE (multiple interpretation error) and validate-before-cleanse.
Web-based mail product doesn't restrict dangerous extensions such as ASPX on a web server, even though others are prohibited.
Resultant XSS when only <script> and <style> are checked.
Privileged program does not clear sensitive environment variables that are used by bash. Overlaps multiple interpretation error.
SQL injection protection scheme does not quote the "\" special character.
Detection of risky filename extensions prevents users from automatically executing .EXE files, but .LNK is accepted, allowing resultant Windows symbolic link.
Product uses list of protected variables, but accidentally omits one dangerous variable, allowing external modification.
Chain: product only removes SCRIPT tags (CWE-184), enabling XSS (CWE-79).
Chain: product only checks for use of "javascript:" tag (CWE-184), allowing XSS (CWE-79) using other tags.
Chain: OS command injection (CWE-78) enabled by using an unexpected character that is not explicitly disallowed (CWE-184).
"\" not in list of disallowed values for web server, allowing path traversal attacks when the server is run on Windows and other OSes.
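The advice to prefer lists of allowed inputs and to encode outputs can be illustrated with a small standard-library sketch. It is an assumption-laden toy, not a production sanitizer: `ALLOWED_TAGS`, `AllowlistFilter` and `filter_html` are hypothetical names, and Python's `html.parser` does not parse HTML the way browsers do, which is exactly the class of mismatch behind this CVE.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist chosen for illustration; real policies will differ.
ALLOWED_TAGS = frozenset({"p", "b", "i", "em", "strong", "ul", "ol", "li", "br"})
VOID_TAGS = frozenset({"br"})  # tags that take no closing tag


class AllowlistFilter(HTMLParser):
    """Re-emit only allowlisted tags (without attributes) and escape all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Anything not explicitly allowed is dropped; attributes are never kept.
        if tag in ALLOWED_TAGS:
            self.parts.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        # Output encoding: text is always escaped before being emitted.
        self.parts.append(escape(data))


def filter_html(untrusted):
    """Return the input with non-allowlisted markup removed."""
    parser = AllowlistFilter()
    parser.feed(untrusted)
    parser.close()
    return "".join(parser.parts)


if __name__ == "__main__":
    print(filter_html('<p onclick="x()">Hi <b>there</b></p><svg><p>x</p></svg>'))
```

Because the filter names what is permitted rather than what is forbidden, tags such as `<svg>`, `<math>` and `<noscript>` are dropped without ever having to appear on a list. Comments and declarations are also discarded, since the corresponding parser callbacks are left at their no-op defaults.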

EXPLOITS


Exploit-db.com

id description date
No known exploits

POC Github

Url
No known exploits

Other Nist (github, ...)

Url
No known exploits


CAPEC


Common Attack Pattern Enumeration and Classification

id   description                                                        severity
120  Double Encoding                                                    Medium
15   Command Delimiters                                                 High
182  Flash Injection                                                    Medium
3    Using Leading 'Ghost' Character Sequences to Bypass Input Filters  Medium
43   Exploiting Multiple Input Interpretation Layers                    High
6    Argument Injection                                                 High
71   Using Unicode Encoding to Bypass Validation Logic                  High
73   User-Controlled Filename                                           High
85   AJAX Footprinting                                                  Low