Session transcripts as one big HTML file

Nienna

SuperModerator
Moderator
FOTCM Member
Thank you, KS! This is terrific!

Now, I am not a very tech-savvy person, so I just have to check with whoever can answer this for me. As far as I've seen, there are four different downloads because of corrections. Is the last download, 15926619, the only one that needs to be saved? Does it stand on its own, so that I can delete the other downloads I have? Hope that makes sense.
 

Luks

The Living Force
FOTCM Member
Little tip: did you know that you can easily merge all your txt files into one txt file?

I did it with the Cassiopaea session files. Open your command line, go to the folder with the files, and use the "copy" command to copy the contents of all of them. After a space, type "*.txt": the asterisk stands for "I want all files" and ".txt" limits that to files with that extension. Then name your output file; I used the name "sessions".
 

Attachments

  • mergingtxt.PNG
    1.9 KB · Views: 15

Luks

The Living Force
FOTCM Member
Thank you, KS! This is terrific!

Now, I am not a very tech-savvy person, so I just have to check with whoever can answer this for me. As far as I've seen, there are four different downloads because of corrections. Is the last download, 15926619, the only one that needs to be saved? Does it stand on its own, so that I can delete the other downloads I have? Hope that makes sense.

You can download the file I am sharing. It contains all the C's sessions. After that, you can open it with any software for reading, editing, and searching text that you like and can find on the internet.
 

Attachments

  • sessions.txt
    6 MB · Views: 65

goyacobol

The Living Force
FOTCM Member
For those interested in the scraper, its source code is also attached as a separate zip file. Any faults in the scraper are my own, and I encourage feedback on them, since I'm not a full time programmer and have only the basic programming knowledge I used to build this scraper. My only intention in building a scraper was to be able to search the transcripts more quickly and easily than with the online tool on the forum, which is great as well. My sincere thanks to Laura and the crew for making the transcripts freely available to everyone.

Thanks, @chrismcdude. You may not be "a full time programmer", but you think like one, I would say. For me, it is about finding a solution that makes repetitive actions faster and easier, and I will sometimes spend a lot of energy trying to find one (is it fun...? sometimes). :nuts:
 

KS

Jedi Master
Thank you, KS! This is terrific!

Now, I am not a very tech-savvy person, so I just have to check with whoever can answer this for me. As far as I've seen, there are four different downloads because of corrections. Is the last download, 15926619, the only one that needs to be saved? Does it stand on its own, so that I can delete the other downloads I have? Hope that makes sense.
Yes, the files I'm posting have a "Unix epoch" timestamp (seconds since 1 Jan 1970) as a suffix. Generally, the bigger the number, the newer the file.
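As a rough illustration of the "bigger number is newer" rule, here is a small Python sketch that reads the suffix from each file name, picks the newest archive, and converts the suffix to a readable date. The file-name pattern is an assumption based on the one attachment name visible in this thread, the older file name is made up for the example, and the function names are my own.

```python
import re
from datetime import datetime, timezone

# Assumed naming pattern: "sessions-<unix epoch seconds>.zip".
SUFFIX_RE = re.compile(r"-(\d+)\.zip$")


def epoch_suffix(filename: str) -> int:
    """Return the Unix-epoch suffix embedded in an archive name."""
    match = SUFFIX_RE.search(filename)
    if match is None:
        raise ValueError(f"no epoch suffix in {filename!r}")
    return int(match.group(1))


def newest(filenames: list[str]) -> str:
    """Pick the archive with the largest (most recent) suffix."""
    return max(filenames, key=epoch_suffix)


def generated_at(filename: str) -> datetime:
    """Convert the suffix to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_suffix(filename), tz=timezone.utc)


downloads = [
    "sessions-1592000000.zip",  # hypothetical older download
    "sessions-1592747786.zip",  # attachment name taken from this thread
]
print(newest(downloads))
print(generated_at(newest(downloads)).isoformat())
```

Comparing the suffixes as integers rather than as text avoids surprises if two suffixes ever have different lengths.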

Inspired by @chrismcdude, the sessions are now processed with a "readability" algorithm to clean XenForo-related tags out of the HTML code. This also fixed the cases where a session transcript was posted as a quote (a few sessions from 2008/2009). I've also removed the "doctype" tag from the beginning; maybe this will resolve the issues related to the Chrome browser. AFAIK, there is only one session missing: 22 Feb 2010.

Once again, the newest generated file is attached to this post.
 

Attachments

  • sessions-1592747786.zip
    2.1 MB · Views: 104

goyacobol

The Living Force
FOTCM Member
You can download the file I am sharing. It contains all the C's sessions. After that, you can open it with any software for reading, editing, and searching text that you like and can find on the internet.

@Luks,

I was just looking at the text file you attached and I noticed a couple of things.

Session 10 January 2002:
A: Either way it will come. "Miles to go before you sleep." Keep on going. Destination will be reached.

Q: Anything else we ought to know that we haven't asked?

A: Not for now. Goodnight.

End of Session

The above session runs into the next one, which is February 23, 2002. This seems to be consistent throughout: the "End of Session" line is merged into the next session's date.

Q: Anything else we ought to know that we haven't asked?

A: Not for now. Goodnight.

End of SessionFebruary 23, 2002

Ark, Laura, Barry T, Rick O, VG, Jeannine & M N****

Q: (L) Hello.

A: Yes. Hello.
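If the merged file is already in hand, the fused boundaries shown above could be repaired after the fact. The following is only a sketch: it assumes every session closes with the exact text "End of Session" and that the next session starts with a capital letter or a digit, which matches the excerpt above but has not been checked against the whole file. The function name is my own.

```python
import re

# "End of Session" immediately followed by a capital letter or digit marks a
# boundary that lost its line break when the files were merged.
FUSED_BOUNDARY = re.compile(r"(End of Session)(?=[A-Z0-9])")


def split_fused_sessions(text: str) -> str:
    """Insert a blank line wherever a session end runs into the next header."""
    return FUSED_BOUNDARY.sub(r"\1\n\n", text)


sample = "A: Not for now. Goodnight.\n\nEnd of SessionFebruary 23, 2002\n"
print(split_fused_sessions(sample))
```

Boundaries that already have a line break are left alone, so running it twice on the same text changes nothing the second time.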

Also, with the new forum software the "Click to Expand" feature can create some problems in saving a page where those parts are not expanded.

The {Scott} Cassiopaea Logs: 1994 -2000

(c) {Frank Scott}and Laura Knight Jadcyzk 1994 -2000

(By granting Ms. Knight Jadcyzk co-author status, Mr. "Scott," in light of Ms. Knight Jadcyzk's previous bad faith attempts to use the material without permission, makes the following stipulations:

1) No commercial use of the material whatsoever.

2) The material is to be made freely available in its unedited form, and may be transmitted or used only in that form.

3) No permission is granted or implied for any use of excerpts, beyond the length and for the reasons outlined in the fair use provision of the DMCA, or individual sessions by any person, organization or corporation whatsoever.

(Any further use, after March 1st, 2002, of any material copyrighted as noted above without the express written permission of both authors, or that violates the permission stated in the above copyright notice, will be considered an act of copyright infringement, which, if proven, carries penalties of $20,000 to $100,00 per infraction.

(Please be advised that this warning is an official notice to cease and desist all copyright infringements and any other use of the material which does not have the express written permission of both authors, or which violates the permission granted in the copyright statement.).

Click to expand...

Sometimes it is those little things that drive you up a wall...:nuts:
 

Luks

The Living Force
FOTCM Member
The above session runs into the next one, which is February 23, 2002. This seems to be consistent throughout: the "End of Session" line is merged into the next session's date.

Yes, I see now. But still good for searching.

I looked around on the internet a bit, and there is a command like this:

for %f in (*.txt) do type "%f" >> name.txt & echo. >> name.txt

It works, adding one line break after each file. I got it from here: 6 Ways To Combine or Merge Multiple Text Files • Raymond.CC

However, I think doing this properly in a programming language is better, though sometimes it's fun to play with Cmd. We already have the HTML file with all the sessions besides 22 Feb 2010, so I'll leave it for the moment.
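For anyone who wants to try the programming-language route, a minimal Python sketch might look like the following. It assumes the session files are UTF-8 text files sitting in one folder and that sorting by file name gives an acceptable order; the function names and the output name "sessions_merged.txt" are made up for this example.

```python
from pathlib import Path


def join_texts(texts: list[str], separator: str = "\n\n") -> str:
    """Join file contents so each one ends cleanly before the next begins."""
    return separator.join(t.rstrip("\n") for t in texts) + "\n"


def merge_txt_files(folder: str, output_name: str = "sessions_merged.txt") -> Path:
    """Merge every .txt file in `folder` (sorted by name) into one file."""
    source_dir = Path(folder)
    # Skip the output file itself so a second run does not merge it back in.
    inputs = sorted(p for p in source_dir.glob("*.txt") if p.name != output_name)
    merged = join_texts([p.read_text(encoding="utf-8") for p in inputs])
    output = source_dir / output_name
    output.write_text(merged, encoding="utf-8")
    return output
```

Calling merge_txt_files(".") in the folder with the session files would write the merged file next to them. Because a blank line is placed between files, an "End of Session" line no longer runs into the date of the following session.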
 

Voyageur

Ambassador
FOTCM Member
Great work, KS!

Yes, it's handy.

What Ark and I have is even handier. I file all sessions as doc in folders by year. Then, we have a program called "DTSearch" which indexes whatever you tell it to. All the sessions are indexed and I just put in a key word or string and it pulls them up and displays folder and date in the upper window, and the session text with keyword(s) highlighted in lower window; this allows me to identify if that is the session I want. If it is not, I just click the arrow and it takes me to the next occurrence of the word, jumping from session to session as needed.

I also have Gurdjieff's books, the bible, Josephus, Herodotus, Nostradamus, and a few other texts indexed and can search any I choose fast and easy with that handy display. I can't recommend this system highly enough because it sure makes searching easy and efficient.

Wow, I checked this out, if it is exactly the same program (video below). Excellent search tool with many methods one can use.

 

goyacobol

The Living Force
FOTCM Member
Excellent search tool with many methods one can use.

@Voyageur,

I agree it is a commercial-grade, high-powered search tool, and Laura is probably making good use of it.

For the average person, I think it might be a bit expensive. I think the lowest price is $199.00.

dtSearch Desktop with Spider (includes 64-bit versions): single user license, $199 (international currency selector available). Requires Windows.
Other products listed: Network with Spider, Web with Spider, Engine for Windows, Engine for Linux, Engine for macOS, Publish (portable media).
 

Chris

Jedi
@Voyageur,

I agree it is a commercial-grade, high-powered search tool, and Laura is probably making good use of it.

For the average person, I think it might be a bit expensive. I think the lowest price is $199.00.

Notepad++ will do the same job as dtSearch, at least for transcripts, as I've detailed in my previous posts. It can be downloaded for free and will probably do the trick for most people.

dtSearch will be more useful for researchers (like Laura) who need to quickly look up text stored within e-books, in multiple formats, etc. It is really powerful software, but pricey if the task is only to look up text within transcripts.
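To make the idea of "indexing" a little more concrete, here is a toy Python sketch of a keyword index over session texts. It is only an illustration of the general technique, not how dtSearch or Notepad++ work internally, and the function names are my own. The two sample texts are short excerpts quoted earlier in this thread.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> set[str]:
    """Lower-case the text and return its distinct words."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def build_index(sessions: dict[str, str]) -> dict[str, set[str]]:
    """Map every word to the set of session names that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for name, text in sessions.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index: dict[str, set[str]], query: str) -> list[str]:
    """Return the sessions that contain all of the words in the query."""
    words = tokenize(query)
    if not words:
        return []
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(hits)


sessions = {
    "10 January 2002": "A: Either way it will come. Keep on going. Destination will be reached.",
    "23 February 2002": "Q: (L) Hello. A: Yes. Hello.",
}
index = build_index(sessions)
print(search(index, "destination reached"))
```

The index is built once, and each search is then a set lookup instead of a scan through every file, which is why indexed tools stay fast on large collections.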
 

goyacobol

The Living Force
FOTCM Member
Notepad++ will do the same job as dtSearch, at least for transcripts, as I've detailed in my previous posts. It can be downloaded for free and will probably do the trick for most people.

dtSearch will be more useful for researchers (like Laura) who need to quickly look up text stored within e-books, in multiple formats, etc. It is really powerful software, but pricey if the task is only to look up text within transcripts.

Also, the Adobe PDF reader has an advanced search feature that does most of the same Boolean types of searches.

Adobe Advanced Search.png
 