I have HTML files that contain inline image data. When I convert them to Markdown, the images are embedded as Base64, which makes the resulting Markdown very large.
An option to prevent this would be nice.
In the interim, you can pre-process your HTML with HtmlAgilityPack to remove the <img> elements that carry inline image data, as below:
    using System;
    using HtmlAgilityPack;

    string html = @"
    <html>
    <body>
        <img src='data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUA' />
        <img src='https://example.com/image.jpg' />
    </body>
    </html>";

    // Load HTML document
    HtmlDocument document = new HtmlDocument();
    document.LoadHtml(html);

    // Select all <img> tags (SelectNodes returns null when nothing matches)
    var imageNodes = document.DocumentNode.SelectNodes("//img");
    if (imageNodes != null)
    {
        foreach (var img in imageNodes)
        {
            string src = img.GetAttributeValue("src", string.Empty);

            // Inline images use a data: URI; the scheme is case-insensitive
            if (src.StartsWith("data:image/", StringComparison.OrdinalIgnoreCase))
            {
                // Remove the <img> node from the HTML
                img.Remove();
            }
        }
    }

    // Save or display the cleaned HTML
    string cleanedHtml = document.DocumentNode.OuterHtml;
    Console.WriteLine(cleanedHtml);