Session transcripts as one big HTML file

Thanks @KS. EPUB would be great, and Calibre can then convert the EPUB into MOBI or AZW3.
I've tried Calibre's ebook-convert utility to create an AZW3 from the EPUB and it works nicely. Here's what it looks like in the Kindle app (which should render the book the same way as Amazon's e-Ink reader):
photo_2022-09-16_09-45-55.jpg
I've spotted some minor formatting errors, but there should be no missing content (I'm 95% sure about that, but I can't be fully certain without reading those 4000+ pages). I'll try to fix those in the coming week and also make the repository public. I've attached a ZIP archive (the forum doesn't accept AZW3 attachments) containing the book in a Kindle-compatible format. Let me know if it works for you.
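For anyone who wants to repeat the conversion, here is a minimal sketch in Python. It assumes Calibre is installed so that `ebook-convert` is on the PATH; the file name `sessions.epub` and the helper names are only illustrative, not the ones actually used for the attached files. `ebook-convert` picks the output format from the extension of the output file.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(epub_path, out_format="azw3"):
    """Return the ebook-convert argument list for one EPUB file.

    The output file sits next to the input and only the extension changes.
    """
    src = Path(epub_path)
    dst = src.with_suffix("." + out_format)
    return ["ebook-convert", str(src), str(dst)]


def convert(epub_path, out_format="azw3"):
    """Run the conversion; raises if Calibre's CLI is not available."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found; install Calibre first")
    cmd = build_convert_command(epub_path, out_format)
    subprocess.run(cmd, check=True)  # check=True raises on a failed conversion
    return Path(cmd[2])
```

Calling `convert("sessions.epub")` would write `sessions.azw3` next to the source file; passing `out_format="mobi"` gives a MOBI instead.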
 

Attachments

  • sessions-kindle.zip
    8.8 MB · Views: 34
I've fixed a lot of formatting issues and compiled all the documents once again. There are two PDFs, one of which is intended for a small screen such as a 7" tablet or an e-Ink reader. The ZIP file contains files that can be used on a Kindle and other readers. The Word document could also be a good option for searching and making notes. That's it for now; I'll fine-tune these over the next few weeks according to your comments :-)
 

Attachments

  • sessions-small.pdf
    15.1 MB · Views: 49
  • sessions-ebook.zip
    16.2 MB · Views: 45
  • sessions.pdf
    13.6 MB · Views: 76
  • sessions.docx
    7.8 MB · Views: 38
Thank you so much for your hard work, @KS. Much appreciated! :clap:
 
Thank you, @KS, for the beautiful results of your efforts to format the texts of the sessions for readability.
Hi all, I just wanted to let you know that the transcripts file is also available here: https://liberty239.github.io/
You can right-click and choose "save file as..." from the context menu to save it to your computer.
The contents of the document at the link above will be updated automatically (even if I go to 5D) on the 1st and 15th of every month.
One observation is that the link does not continue beyond April 23, 2022. However, updates are less necessary now that there is a formatted version, which ends with August 27, 2022. This means that if one complements a search in the PDF with a search in the sessions not yet included in the PDF, one has covered everything.
 
Yes, that's true. I'll probably kill that account (and website) because something broke in the forum scraper and I haven't had time to find out what exactly. Also, the automatic generation wasn't error-free, and I don't feel that the material should be hosted on some obscure domain. This entire new compilation is based on my skimming through all the sessions; a few of them needed manual editing because the conversion algorithm interpreted some things wrongly (for example, one session was posted as a quote). I need to think about how to automate the build. If sessions now come about once a month, I should stay within the free GitHub Actions minutes (GitHub is where the sources are kept; the minutes are a time quota for jobs that run, for example, when the repository is updated). I've also posted the link to the public repository in a private section of the forum (I don't want to reveal my identity publicly so easily), so anyone who knows how to install LaTeX and Pandoc on their GNU/Linux or macOS PC can build it themselves (if someday I suddenly go to 5D ;-)).
I've attached a compilation containing the latest session. 2868 pages for that bigger, trade-sized PDF... Oh my!
 

Attachments

  • sessions.docx
    7.8 MB · Views: 19
  • sessions.pdf
    13.7 MB · Views: 51
  • sessions-small.pdf
    15.3 MB · Views: 25
@KS I would like to offer you my help if it is still needed. I don't know if you are still developing your GitHub - liberty239/cassiopaea-tools site or where you are at the moment. I don't have any experience with Go (if I'm not mistaken, that is the language you use), but I'm willing to learn.
Maybe the Chateau already has the transcripts in a more or less uniform format, since Laura uses DTSearch and needs to load them somehow. My idea was that we could create a sort of standard for the transcripts that could be parsed by your application as well as by DTSearch and any other program in use.
I don't know if the Chateau has its own servers (maybe @Scottie could help clarify that). If they can host a site, we could eliminate possible tampering with the material by a third party, and they would have control over what/how/when.
I'm just stating some ideas that I have, but they might not be feasible or possible. Anyway, tell me if you want/need help with this project.
 
Well, I've just deleted that old GitHub account. If you want to play with the new one, I can send you the link via private message. The new repository is basically a collection of Markdown documents (one per session) and Pandoc automation scripts, with added Docker support for reproducibility. I invested some time and reviewed almost all of the documents for missing content, but you are right that it's just my work, not affiliated with FOTCM in any way, so one should always question the validity of the content if not sure.

If you have some interesting ideas on what could further be done with those documents, I'm open and very glad that someone would like to contribute. For example, each document could be tagged with automatically generated keywords for better searchability, etc. As for the automatic building (when a new session is committed), I think that we can trust GitHub and its Actions (which run on Azure); I just need to find some time to do that.
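To give an idea of what such a build step does with one Markdown document per session, here is a rough sketch in Python that only assembles a Pandoc command line. It is not the actual script from the repository: the date-based file names, the `xelatex` engine choice, and the function name are assumptions made for illustration. Only the `pandoc` options `-o` and `--pdf-engine` are used.

```python
def build_pandoc_command(markdown_files, output_file, pdf_engine="xelatex"):
    """Return a pandoc argument list that joins session files into one output.

    Files are sorted by name, which puts sessions in chronological order
    if (as assumed here) each file name starts with an ISO date.
    """
    files = sorted(str(f) for f in markdown_files)
    if not files:
        raise ValueError("no Markdown files given")
    cmd = ["pandoc", *files, "-o", str(output_file)]
    # A LaTeX engine is only needed when the target is a PDF.
    if str(output_file).endswith(".pdf"):
        cmd.append(f"--pdf-engine={pdf_engine}")
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`; the same function with `sessions.epub` as the output would produce the e-book input for Calibre.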
 
Ah ok. So the current state of the project is this thread with the downloadable zips. I wasn't sure if you are actively developing both the liberty site and the documents.

Yes, I also thought about tags and think that they would be helpful. This could also be a mix of automatically generated tags and additional input from forum members, in case someone thinks that a specific tag is missing. Maybe something similar to the 📚 Cassiopaean Session Transcripts by date thread, but for tags.

Just tell me if you need any help and I'll do my best.
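As a very rough illustration of the tagging idea, here is a sketch in Python that mixes automatic and manual tags. Everything in it is an assumption rather than an agreed design: the automatic part is plain word counting (real keyword extraction would need something better), the stop-word list is a tiny placeholder, and `suggest_tags` and `manual_tags` are hypothetical names.

```python
import re
from collections import Counter

# Placeholder stop-word list; a real one would be much longer.
STOPWORDS = {"that", "this", "with", "from", "have", "what", "they", "there"}


def suggest_tags(text, manual_tags=(), limit=5):
    """Return manual tags first, then the most frequent words in the text.

    Only words of four or more letters are counted, and duplicates
    are dropped while keeping the original order.
    """
    words = re.findall(r"[a-z]{4,}", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    auto = [word for word, _ in counts.most_common(limit)]
    manual = [tag.lower() for tag in manual_tags]
    return list(dict.fromkeys([*manual, *auto]))
```

Tags added by forum members would go in `manual_tags`, so they always survive even when the word counts miss the topic of a session.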
 