I’m working on a plugin that parses all posts and gathers them into a JSON file to be consumed by a search mechanism. How can I access just the text of a post, with no markup? I’m currently iterating over site.posts and reading page.content for each post. This returns the content of the post, but it still includes newline characters (\n) and Markdown syntax.
I saw another question in which someone wanted to get Markdown processed content in a Jekyll tag plugin, but my case is different: I don't want any markup at all, just the plain text of the post, with no formatting applied.
Below is the key def from my current implementation.
def generate(site)
  target = File.open('js/searchcontent.js', 'w')
  target.truncate(target.size)
  target.puts('var tipuesearch = {"pages": [')
  all_but_last, last = site.posts[0..-2], site.posts.last
  # Process all posts but the last one
  all_but_last.each do |page|
    tp_page = TipuePage.new(
      page.data['title'],
      "#{page.data['tags']} #{page.data['categories']}",
      page.url,
      page.content
    )
    target.puts(tp_page.to_json + ',')
  end
  # Do the last post
  tp_page = TipuePage.new(
    last.data['title'],
    "#{last.data['tags']} #{last.data['categories']}",
    last.url,
    last.content
  )
  target.puts(tp_page.to_json)
  target.puts(']};')
  target.close
end
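To make the goal concrete, here is a rough sketch of the kind of transformation I’m after. It assumes the post content has already been rendered to HTML, which is not what page.content gives me above. The helper name plain_text is hypothetical and not part of Jekyll, and the regex-based tag stripping is only an approximation, not a real HTML parser.

```ruby
require 'cgi'

# Hypothetical helper (not part of Jekyll): reduce rendered HTML to plain text.
# Assumes the input is HTML, not raw Markdown. Regex tag stripping is a rough
# approximation and will not handle every edge case a real parser would.
def plain_text(html)
  text = html.gsub(/<[^>]*>/, ' ')  # replace each tag with a space
  text = CGI.unescapeHTML(text)     # decode entities such as &amp;
  text.gsub(/\s+/, ' ').strip       # collapse newlines and repeated whitespace
end

sample = "<h1>Hello</h1>\n<p>Some <em>formatted</em> text &amp; more.</p>\n"
puts plain_text(sample)
```

Something that produces this sort of output for each post is what I would like to pass to TipuePage.new in place of page.content.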