Forum: Meta: Markdown bug when text content has HTML tags
3.4 years ago by
RamRS24k
Houston, TX
RamRS24k wrote:

Hello all,

I just noticed this: Markdown-style hyperlinks don't work when the text in the main text area contains HTML. For example, when you moderate->close a post and then edit the moderation comment to add links, the edit page comes up with the content already populated with HTML tags (such as the <p> tag).

I got the links working once the tags were removed. I think the Markdown parser gets into a conflict with a different HTML parser and that's what causes this. Any thoughts on how we can address this?

Thank you!

--
Ram

bug forum meta • 931 views
modified 16 months ago by h.mon27k • written 3.4 years ago by RamRS24k

Also, these types of issues should also be opened on the GitHub issue page and cross-linked. While it is fine to talk about them here, I might occasionally miss some of these topics.

written 3.4 years ago by Istvan Albert ♦♦ 81k

I did think of GitHub, but I'm going to use my usual "run-it-by-community" excuse for the last time now. The next time I am in a similar situation, GitHub will be my go-to spot.

written 3.4 years ago by RamRS24k

Missing content in old posts?. This is one such case I have faced recently.

written 3.4 years ago by venu6.3k
3.4 years ago by
Istvan Albert ♦♦ 81k
University Park, USA
Istvan Albert ♦♦ 81k wrote:

We have posts from three different versions of Biostar: an old Markdown, a CKEditor-based HTML, and now a new-style "CommonMark" type of Markdown.

Markdown does allow mixing HTML content into the Markdown, but it will not apply Markdown-specific instructions to content placed inside HTML tags.

For a few years posts were created directly in HTML rather than Markdown. These were not automatically converted to Markdown since it was not clear that an automatic conversion would handle all cases correctly, and the goal was to protect the existing content as is.

When one edits an older post that started out as HTML, one also needs to remove the simple formatting tags such as <p> (especially the root tag that makes the whole post HTML) to make use of Markdown instructions.
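
As a rough illustration of that manual cleanup, here is a minimal Python sketch that strips simple wrapper tags from a post body so that Markdown links in it can be parsed again. The function name `strip_simple_tags` and the tag list are assumptions made for this example; this is not how Biostar itself handles posts.

```python
import re

# Assumption: these are the "simple formatting tags" worth stripping.
# The real set used in old Biostar posts may differ.
SIMPLE_TAGS = ("p", "div", "span")
_TAG_RE = re.compile(
    r"</?(?:" + "|".join(SIMPLE_TAGS) + r")\b[^>]*>", re.IGNORECASE
)


def strip_simple_tags(text):
    """Remove simple wrapper tags but keep the text between them."""
    # Turn closing block tags into blank lines so paragraphs stay separated.
    text = re.sub(r"</(?:p|div)\s*>", "\n\n", text, flags=re.IGNORECASE)
    # Drop whatever opening/closing simple tags are left.
    text = _TAG_RE.sub("", text)
    # Collapse runs of three or more newlines and trim the ends.
    return re.sub(r"\n{3,}", "\n\n", text).strip()


post = "<p>Closed as duplicate of [this post](https://example.org/p/1).</p><p>Thanks!</p>"
print(strip_simple_tags(post))
```

Tags outside the list (for example `<pre>` or `<iframe>`) are left untouched, so a post that still contains them would need a closer look by hand.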

written 3.4 years ago by Istvan Albert ♦♦ 81k

This makes total sense. The content in posts and comments is stored in a DB, correct? Would it be possible to look for an HTML-to-MD converter that would replace formatting and anchor tags, perhaps? I am sure you have considered this; what are the major challenges?

written 3.4 years ago by RamRS24k

There is a pretty handy library for this. I tested it out and it seems to work:

https://github.com/aaronsw/html2text

But that alone would not fix a good number of cases, especially those with iframes, which would require a different parser to be written. Since it was not a full solution, we never applied it.
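
To make the trade-off concrete, here is a small sketch of what a partial converter could look like, using only Python's standard `html.parser`. It does not use html2text and is not the converter that was tested. The class name, the function name `convert`, and the set of supported tags are all assumptions for illustration. Anything it cannot translate, such as an iframe, is reported rather than silently dropped.

```python
from html.parser import HTMLParser


class SimpleHtmlToMarkdown(HTMLParser):
    """Translate a small subset of HTML to Markdown and flag the rest."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []             # pieces of the Markdown output
        self.href = None          # target of the <a> currently open
        self.unsupported = set()  # tags this sketch cannot translate

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.href = dict(attrs).get("href") or ""
            self.out.append("[")
        elif tag in ("b", "strong"):
            self.out.append("**")
        elif tag in ("i", "em"):
            self.out.append("*")
        elif tag != "p":
            # e.g. iframe: would need its own, separate handling
            self.unsupported.add(tag)

    def handle_endtag(self, tag):
        if tag == "a":
            self.out.append("](%s)" % (self.href or ""))
            self.href = None
        elif tag in ("b", "strong"):
            self.out.append("**")
        elif tag in ("i", "em"):
            self.out.append("*")
        elif tag == "p":
            self.out.append("\n\n")

    def handle_data(self, data):
        self.out.append(data)


def convert(html_text):
    """Return (markdown_text, sorted list of unsupported tags)."""
    parser = SimpleHtmlToMarkdown()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out).strip(), sorted(parser.unsupported)


markdown, skipped = convert('<p>See <a href="https://example.org">this <b>post</b></a>.</p>')
print(markdown)
print(skipped)
```

A migration built on something like this would presumably rewrite a post only when the list of unsupported tags comes back empty, and leave every other post as is.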

written 3.4 years ago by Istvan Albert ♦♦ 81k

There are iframes in posts? Wow!

written 3.4 years ago by RamRS24k

Only when content is embedded: gist, YouTube, Twitter.

written 3.4 years ago by Istvan Albert ♦♦ 81k

Ah. I thought embedding used <embed> or <object> tags - just noticed that W3schools now recommends <iframe>s. This is challenging stuff!

written 3.4 years ago by RamRS24k
Powered by Biostar version 2.3.0