gsd-2024-0243
Vulnerability from gsd
Modified
2024-01-05 06:02
Details
With the following crawler configuration:

```python
from bs4 import BeautifulSoup as Soup
from langchain_community.document_loaders import RecursiveUrlLoader

url = "https://example.com"
loader = RecursiveUrlLoader(
    url=url, max_depth=2, extractor=lambda x: Soup(x, "html.parser").text
)
docs = loader.load()
```

An attacker in control of the contents of `https://example.com` could place a malicious HTML file there containing links such as "https://example.completely.different/my_file.html", and the crawler would download that file as well, even though `prevent_outside=True`.

https://github.com/langchain-ai/langchain/blob/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22/libs/community/langchain_community/document_loaders/recursive_url_loader.py#L51-L51

Resolved in https://github.com/langchain-ai/langchain/pull/15559
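To illustrate why a link such as "https://example.completely.different/my_file.html" can slip past an "outside" check, the sketch below contrasts a raw string-prefix comparison with a comparison of parsed scheme and host. Both function names (`naive_prefix_check`, `same_host`) are hypothetical and written for this example; this is not the library's code or the actual fix, only a minimal sketch of the general technique using the standard library.

```python
from urllib.parse import urljoin, urlparse


def naive_prefix_check(base_url: str, link: str) -> bool:
    """Hypothetical weak check: compares raw strings, not parsed hosts."""
    return link.startswith(base_url)


def same_host(base_url: str, link: str) -> bool:
    """Hypothetical stricter check: resolve the link, then compare scheme and host."""
    base = urlparse(base_url)
    target = urlparse(urljoin(base_url, link))
    return (target.scheme, target.netloc.lower()) == (base.scheme, base.netloc.lower())


base = "https://example.com"
evil = "https://example.completely.different/my_file.html"

print(naive_prefix_check(base, evil))       # True: the string merely begins with the base URL
print(same_host(base, evil))                # False: the host differs
print(same_host(base, "/docs/index.html"))  # True: relative links resolve to the same host
```

The prefix test passes because "https://example.completely.different" starts with the same characters as "https://example.com"; comparing parsed hosts avoids that ambiguity.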
Aliases
CVE-2024-0243

{
   gsd: {
      metadata: {
         exploitCode: "unknown",
         remediation: "unknown",
         reportConfidence: "confirmed",
         type: "vulnerability",
      },
      osvSchema: {
         aliases: [
            "CVE-2024-0243",
         ],
         details: "With the following crawler configuration:\n\n```python\nfrom bs4 import BeautifulSoup as Soup\n\nurl = \"https://example.com\"\nloader = RecursiveUrlLoader(\n    url=url, max_depth=2, extractor=lambda x: Soup(x, \"html.parser\").text\n)\ndocs = loader.load()\n```\n\nAn attacker in control of the contents of `https://example.com` could place a malicious HTML file in there with links like \"https://example.completely.different/my_file.html\" and the crawler would proceed to download that file as well even though `prevent_outside=True`.\n\nhttps://github.com/langchain-ai/langchain/blob/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22/libs/community/langchain_community/document_loaders/recursive_url_loader.py#L51-L51\n\nResolved in https://github.com/langchain-ai/langchain/pull/15559",
         id: "GSD-2024-0243",
         modified: "2024-01-05T06:02:19.749927Z",
         schema_version: "1.4.0",
      },
   },
   namespaces: {
      "cve.org": {
         CVE_data_meta: {
            ASSIGNER: "security@huntr.com",
            ID: "CVE-2024-0243",
            STATE: "PUBLIC",
         },
         affects: {
            vendor: {
               vendor_data: [
                  {
                     product: {
                        product_data: [
                           {
                              product_name: "langchain-ai/langchain",
                              version: {
                                 version_data: [
                                    {
                                       version_affected: "<",
                                       version_name: "unspecified",
                                       version_value: "0.1.0",
                                    },
                                 ],
                              },
                           },
                        ],
                     },
                     vendor_name: "langchain-ai",
                  },
               ],
            },
         },
         data_format: "MITRE",
         data_type: "CVE",
         data_version: "4.0",
         description: {
            description_data: [
               {
                  lang: "eng",
                  value: "With the following crawler configuration:\n\n```python\nfrom bs4 import BeautifulSoup as Soup\n\nurl = \"https://example.com\"\nloader = RecursiveUrlLoader(\n    url=url, max_depth=2, extractor=lambda x: Soup(x, \"html.parser\").text\n)\ndocs = loader.load()\n```\n\nAn attacker in control of the contents of `https://example.com` could place a malicious HTML file in there with links like \"https://example.completely.different/my_file.html\" and the crawler would proceed to download that file as well even though `prevent_outside=True`.\n\nhttps://github.com/langchain-ai/langchain/blob/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22/libs/community/langchain_community/document_loaders/recursive_url_loader.py#L51-L51\n\nResolved in https://github.com/langchain-ai/langchain/pull/15559",
               },
            ],
         },
         impact: {
            cvss: [
               {
                  attackComplexity: "HIGH",
                  attackVector: "LOCAL",
                  availabilityImpact: "NONE",
                  baseScore: 3.7,
                  baseSeverity: "LOW",
                  confidentialityImpact: "LOW",
                  integrityImpact: "LOW",
                  privilegesRequired: "HIGH",
                  scope: "CHANGED",
                  userInteraction: "REQUIRED",
                  vectorString: "CVSS:3.0/AV:L/AC:H/PR:H/UI:R/S:C/C:L/I:L/A:N",
                  version: "3.0",
               },
            ],
         },
         problemtype: {
            problemtype_data: [
               {
                  description: [
                     {
                        cweId: "CWE-918",
                        lang: "eng",
                        value: "CWE-918 Server-Side Request Forgery (SSRF)",
                     },
                  ],
               },
            ],
         },
         references: {
            reference_data: [
               {
                  name: "https://huntr.com/bounties/370904e7-10ac-40a4-a8d4-e2d16e1ca861",
                  refsource: "MISC",
                  url: "https://huntr.com/bounties/370904e7-10ac-40a4-a8d4-e2d16e1ca861",
               },
               {
                  name: "https://github.com/langchain-ai/langchain/commit/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22",
                  refsource: "MISC",
                  url: "https://github.com/langchain-ai/langchain/commit/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22",
               },
               {
                  name: "https://github.com/langchain-ai/langchain/pull/15559",
                  refsource: "MISC",
                  url: "https://github.com/langchain-ai/langchain/pull/15559",
               },
            ],
         },
         source: {
            advisory: "370904e7-10ac-40a4-a8d4-e2d16e1ca861",
            discovery: "EXTERNAL",
         },
      },
      "nvd.nist.gov": {
         cve: {
            descriptions: [
               {
                  lang: "en",
                  value: "With the following crawler configuration:\n\n```python\nfrom bs4 import BeautifulSoup as Soup\n\nurl = \"https://example.com\"\nloader = RecursiveUrlLoader(\n    url=url, max_depth=2, extractor=lambda x: Soup(x, \"html.parser\").text\n)\ndocs = loader.load()\n```\n\nAn attacker in control of the contents of `https://example.com` could place a malicious HTML file in there with links like \"https://example.completely.different/my_file.html\" and the crawler would proceed to download that file as well even though `prevent_outside=True`.\n\nhttps://github.com/langchain-ai/langchain/blob/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22/libs/community/langchain_community/document_loaders/recursive_url_loader.py#L51-L51\n\nResolved in https://github.com/langchain-ai/langchain/pull/15559",
               },
               {
                  lang: "es",
                  value: "Con la siguiente configuración del rastreador:\n\n```python\nfrom bs4 import BeautifulSoup as Soup\n\nurl = \"https://example.com\"\nloader = RecursiveUrlLoader(\n    url=url, max_depth=2, extractor=lambda x: Soup(x, \"html.parser\").text\n)\ndocs = loader.load()\n```\n\nUn atacante que controle el contenido de `https://example.com` podría colocar allí un archivo HTML malicioso con enlaces como \"https://example.completely.different/my_file.html\" y el rastreador procedería a descargar también ese archivo aunque `prevent_outside=True`.\n\nhttps://github.com/langchain-ai/langchain/blob/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22/libs/community/langchain_community/document_loaders/recursive_url_loader.py#L51-L51\n\nResuelto en https://github.com/langchain-ai/langchain/pull/15559",
               },
            ],
            id: "CVE-2024-0243",
            lastModified: "2024-03-13T21:15:55.173",
            metrics: {
               cvssMetricV30: [
                  {
                     cvssData: {
                        attackComplexity: "HIGH",
                        attackVector: "LOCAL",
                        availabilityImpact: "NONE",
                        baseScore: 3.7,
                        baseSeverity: "LOW",
                        confidentialityImpact: "LOW",
                        integrityImpact: "LOW",
                        privilegesRequired: "HIGH",
                        scope: "CHANGED",
                        userInteraction: "REQUIRED",
                        vectorString: "CVSS:3.0/AV:L/AC:H/PR:H/UI:R/S:C/C:L/I:L/A:N",
                        version: "3.0",
                     },
                     exploitabilityScore: 0.6,
                     impactScore: 2.7,
                     source: "security@huntr.dev",
                     type: "Secondary",
                  },
               ],
            },
            published: "2024-02-26T16:27:49.670",
            references: [
               {
                  source: "security@huntr.dev",
                  url: "https://github.com/langchain-ai/langchain/commit/bf0b3cc0b5ade1fb95a5b1b6fa260e99064c2e22",
               },
               {
                  source: "security@huntr.dev",
                  url: "https://github.com/langchain-ai/langchain/pull/15559",
               },
               {
                  source: "security@huntr.dev",
                  url: "https://huntr.com/bounties/370904e7-10ac-40a4-a8d4-e2d16e1ca861",
               },
            ],
            sourceIdentifier: "security@huntr.dev",
            vulnStatus: "Awaiting Analysis",
            weaknesses: [
               {
                  description: [
                     {
                        lang: "en",
                        value: "CWE-918",
                     },
                  ],
                  source: "security@huntr.dev",
                  type: "Primary",
               },
            ],
         },
      },
   },
}
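The record's `baseScore`, `impactScore`, and `exploitabilityScore` can be reproduced from its `vectorString`. The following is a minimal sketch of the CVSS v3.0 base-score equations, with metric weights taken from the CVSS v3.0 specification; the function and constant names are illustrative, and temporal and environmental metrics are not handled.

```python
import math

# Metric weights from the CVSS v3.0 specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.50}


def roundup(x: float) -> float:
    """Round up to one decimal place (simple form; ignores float edge cases)."""
    return math.ceil(x * 10) / 10


def cvss30_scores(vector: str) -> tuple[float, float, float]:
    """Return (base, impact, exploitability) for a CVSS:3.0 base vector."""
    m = dict(part.split(":") for part in vector.split("/")[1:])
    changed = m["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]

    # Impact sub-score base, then scope-dependent impact.
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss

    exploitability = 8.22 * AV[m["AV"]] * AC[m["AC"]] * pr * UI[m["UI"]]

    if impact <= 0:
        base = 0.0
    elif changed:
        base = roundup(min(1.08 * (impact + exploitability), 10))
    else:
        base = roundup(min(impact + exploitability, 10))
    return base, round(impact, 1), round(exploitability, 1)


print(cvss30_scores("CVSS:3.0/AV:L/AC:H/PR:H/UI:R/S:C/C:L/I:L/A:N"))
# (3.7, 2.7, 0.6)
```

The result matches the values in the record: a base score of 3.7 (LOW), an impact score of 2.7, and an exploitability score of 0.6.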

