Session transcripts as one big HTML file

I've updated the link on Cassiopaean Session Transcripts Search to link to the GitHub page you posted here, is that alright?
I was thinking about rewriting book assembly, but using Transcripts Search as the only source of material. Scraping XenForo was an error-prone experience, resulting in some sessions not being retrieved or being badly formatted in the initial stage. It should be much easier for me to just write a script that uses the Transcripts Search API.
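
For illustration only, here is a minimal sketch of what the assembly step could look like once the sessions have been retrieved. It does not call the Transcripts Search API, whose endpoints and response format are not described in this thread. The record layout (a dict with `date` and `text` keys) and the function name `assemble_html` are assumptions made for the example.

```python
from html import escape


def assemble_html(sessions, title="Session transcripts"):
    """Build one HTML document from a list of session records.

    Assumption: each record is a dict with a 'date' key (ISO string,
    e.g. '1994-07-16') and a 'text' key (plain text). The real data
    structure of the API may differ.
    """
    parts = [
        "<!DOCTYPE html>",
        '<html><head><meta charset="utf-8">',
        "<title>{}</title></head><body>".format(escape(title)),
    ]
    # ISO dates sort chronologically when compared as strings.
    for session in sorted(sessions, key=lambda s: s["date"]):
        date = escape(session["date"])
        parts.append('<h2 id="s-{0}">Session {0}</h2>'.format(date))
        # Treat blank lines as paragraph separators.
        for para in session["text"].split("\n\n"):
            para = para.strip()
            if para:
                parts.append("<p>{}</p>".format(escape(para)))
    parts.append("</body></html>")
    return "\n".join(parts)
```

Escaping every paragraph keeps stray `<` or `&` characters in a transcript from breaking the markup, which was one source of bad formatting when working from scraped pages.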
 
Just saw @artofdream's find. I think that is probably the best reference too. Thanks @artofdream.
Yeah, agree, and thanks for your offer without which I wouldn't have asked for help. Networking!!! :thup::rockon:

By the way, reading the whole session in question, I noticed that it's one of those sessions where only Frank was channelling, so the exchange there should probably be taken with an extra grain of salt when assessing what was conveyed. FWIW.
 
I was thinking about rewriting book assembly, but using Transcripts Search as the only source of material. Scraping XenForo was an error-prone experience, resulting in some sessions not being retrieved or being badly formatted in the initial stage. It should be much easier for me to just write a script that uses the Transcripts Search API.

I'd be happy to collaborate with you on this and show you how the API works plus how the data is structured. If you'd like, we can start a message chat.

Right now, the English and French versions are the most complete of all the languages in the API. The existing Spanish translations are next to be added.
 
I'd be happy to collaborate with you on this and show you how the API works plus how the data is structured. If you'd like, we can start a message chat.
I have just started a new job, so I do not have much free time at the moment, but I will contact you as soon as my work situation loosens up a little. Apart from books, I have recently started using LLMs more seriously, and I am blown away by how good they are at tasks involving language analysis. I am waiting for Grok 3 to become available via API, and then we can experiment with analyzing the transcripts on a different level. For example, in the style of Clif High, we can ask the LLM to list words in the Cs' answers that do not belong to the context or are unusual in it. With 128k-token context windows, we can easily go through the transcripts year by year (quite possibly even decade by decade).
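
To make the year-by-year idea concrete, here is a rough sketch of how sessions could be grouped into batches that fit a context window. Everything in it is an assumption for illustration: the record layout (`date` and `text` keys), the 4-characters-per-token estimate (a crude heuristic, not a real tokenizer), and the amount reserved for the prompt and the reply.

```python
from collections import defaultdict

CHARS_PER_TOKEN = 4  # crude heuristic; a real tokenizer will give other counts


def estimate_tokens(text):
    """Very rough token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN + 1


def batch_by_year(sessions, budget=128_000, reserve=8_000):
    """Group sessions by year, splitting a year when it exceeds the budget.

    Returns a list of (year, [session, ...]) tuples in chronological order.
    'reserve' leaves room for the instructions and the model's answer.
    """
    by_year = defaultdict(list)
    for session in sorted(sessions, key=lambda s: s["date"]):
        by_year[session["date"][:4]].append(session)

    limit = budget - reserve
    batches = []
    for year in sorted(by_year):
        current, used = [], 0
        for session in by_year[year]:
            cost = estimate_tokens(session["text"])
            # Start a new batch if adding this session would overflow.
            if current and used + cost > limit:
                batches.append((year, current))
                current, used = [], 0
            current.append(session)
            used += cost
        if current:
            batches.append((year, current))
    return batches
```

A single session longer than the limit still ends up in a batch of its own, so it would need to be split separately before being sent to a model.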
 
Do you have an epub version as well?
@KJS uploaded several variations in PDF format. For what follows, I worked with the sessions-a4 file and the Calibre application. Some of the details below will be unnecessary for those familiar with these or similar programs, but others may prefer a longer explanation.

First I added the PDF file to the Calibre e-book library, then converted the PDF format to the EPUB format, which however is not accepted for upload to the Forum, so I subsequently used a compression programme (7-Zip) to get an allowed .ZIP extension. Therefore the attached file needs to be unzipped before use, and 7-Zip can do that too.
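
For anyone who prefers to script these two steps, the following is a small sketch of one way to do it, not the procedure used above (which went through the Calibre and 7-Zip interfaces). It assumes Calibre is installed and that its `ebook-convert` command-line tool is on the PATH. The function names are made up for the example, and Python's standard `zipfile` module stands in for 7-Zip.

```python
import shutil
import subprocess
import zipfile
from pathlib import Path


def convert_pdf_to_epub(pdf_path):
    """Convert a PDF to EPUB with Calibre's ebook-convert tool."""
    pdf_path = Path(pdf_path)
    epub_path = pdf_path.with_suffix(".epub")
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found; is Calibre installed?")
    # Basic usage: ebook-convert <input file> <output file>
    subprocess.run(
        ["ebook-convert", str(pdf_path), str(epub_path)], check=True
    )
    return epub_path


def zip_for_upload(file_path):
    """Wrap a file in a .zip archive so the forum accepts the upload."""
    file_path = Path(file_path)
    zip_path = file_path.with_name(file_path.name + ".zip")
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # Store only the file name, not the full directory path.
        zf.write(file_path, arcname=file_path.name)
    return zip_path
```

Usage would be along the lines of `zip_for_upload(convert_pdf_to_epub("sessions-a4.pdf"))`, which leaves a `sessions-a4.epub.zip` next to the original file.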

Testing the new format, I tried opening it in the Calibre E-book viewer. Since the screen view is larger than for PDF, it appears more readable.

In the Calibre E-book viewer, the arrow keys on the keyboard can be used to navigate, but one needs to click the displayed text first to get them to work. The up-down and left-right pairs appear to work in the same manner.

An open question is how the file works on other types of e-book readers, including handheld ones.
 


Here is the compilation in EPUB format. I don't have Calibre on my laptop right now, so I'm unable to convert it to Kindle format (as there's no Kindle export in Pandoc). It's in the ZIP file, as the forum prohibits attaching EPUB files (Windows or macOS should handle ZIP files out of the box).

Sorry that the project is kind of abandoned. I've switched jobs, so I have limited free time, and even that time goes mainly to my daughter, as I'm not able to get an uninterrupted 15 minutes of work outside my day job. That's also partly why the quality of my posts here is so weak.
 
