Saturday, September 17th, 2005 10:01 am
Paraphrasing some stuff in a friendslocked entry I found. Good info; I'll try to do it justice. Hat tip to [livejournal.com profile] runeshower.

Google is currently indexing the contents of LiveJournal blogs' RSS feeds.

RSS feeds don't (at the moment) contain a "block robots/spiders" setting. Google is working on how to respect this setting ANYWAY; they expect to have a fix within the next week.

RSS feeds contain only your public entries. If private or friendslocked content shows up on Google Blog Search, it was indexed at a time when those entries were public. To request its removal you'll have to talk to Google about it.

You can't completely shut off your RSS feed, but you CAN shrink it down to contain only the titles of your entries. To do this, go to the LiveJournal console and enter the command "set synlevel title".
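
If you want to confirm the change took, one option is to look at your feed and see whether any entry still carries body text. Here's a minimal sketch in Python, not anything official from LJ. The sample feed and the function name items_with_body are made up for illustration, and I'm assuming a titles-only feed either omits each item's description or leaves it blank. To check a real journal you'd paste in the XML from your own feed instead of the sample.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of what a titles-only feed might look like.
# A real check would use the XML saved from your own journal's RSS feed.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>example journal</title>
    <item><title>First entry</title><link>http://example.com/1.html</link></item>
    <item><title>Second entry</title><description></description></item>
  </channel>
</rss>"""


def items_with_body(feed_xml):
    """Return the titles of feed items that still have non-empty body text."""
    root = ET.fromstring(feed_xml)
    leaked = []
    for item in root.iter("item"):
        # findtext gives "" when the element is missing (default) or empty.
        description = item.findtext("description", default="")
        if description.strip():
            leaked.append(item.findtext("title", default="(untitled)"))
    return leaked


# An empty list means no entry bodies were found in the feed.
print(items_with_body(SAMPLE_FEED))
```

An empty list means the feed is down to titles; any titles it prints are entries whose text is still in the feed.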

_____________________________


I've done the "set synlevel title" command, and I have "block robots/spiders" checked. However, now that I'm thinking about this a bit, I'm half inclined to allow spiders and let Google index my posts. One of the reasons I flail about with LJ memories so much is that it's so danged hard to SEARCH. Google is certainly equipped to solve that problem! And hey, all public content was already public well before Google began caching it. Pretending there are folks who can't see it or won't find it is a recipe for getting badly hurt. All I'll have to do is make sure I don't put my e-mail address in my entries. I get enough spam.
Saturday, September 17th, 2005 10:48 pm (UTC)
Trying to crawl back after a long absence ....

Your final paragraph is very helpful, since I'm flailing in the same way for the same reason. I have bots and spiders blocked because I don't want a bunch of strangers showing up from all over the web -- but as my list of memories grows to where I need search engines just for them, maybe it's worth it. I'll never have enough links to my journal to get a PageRank above 827 or so anyway, so what's to worry about?!

If you unblock spiders, will Google crawl all the way back and index all the public entries? Or only the ones that are posted after the unblocking?
Monday, September 19th, 2005 02:53 am (UTC)
*wave* hi!

I don't know whether Google will crawl into the past. Interesting question. It'd be pretty useful to my search capabilities if it did. :-)
Sunday, September 18th, 2005 05:27 am (UTC)
ah, I was wondering why all of your entries showed up (title only, blank content) in my rss reader. They still serve as a reminder to check the friends page too. (One of these days I'll make my reader use an authenticating proxy; I'm pretty sure you see friends-locked posts in rss if you fetch it when logged in...)

Another thing to know about is that there are "live feeds" of all public postings in real time. That's probably not what google uses (since they have older postings too, though it might be a way to discover feeds to look at), but technorati and that sort of thing do. (I pulled that feed into an appliance at work for a couple of days, as a volume scaling test, but also to see if lj is "interestingly geographic". Lots of good "casual" writing...) I assume synlevel controls that too.
Monday, September 19th, 2005 02:54 am (UTC)
"I assume synlevel controls that too."

That'd be my guess, but I really know squat about it. :)
Sunday, September 18th, 2005 03:07 pm (UTC)
Very useful. Thank you very much. [livejournal.com profile] 98 is concerned about the issue of seeing parts of locked posts in a google search. I've pointed him at your post.
Monday, September 19th, 2005 02:55 am (UTC)
Glad it's useful! :)
Sunday, September 18th, 2005 04:14 pm (UTC)
Thanks much for the info. Where is it documented? I tried using the 'help' command at the console to get a list of properties in case there is anything else I might wish to set but no joy.

This RSS feed business definitely has me miffed, although I am not certain how sensible being miffed is. I may decide that I am being like one of those daft geezers who throws a fit because someone linked to his publicly accessible web page without his explicit permission. Or I may decide that I am justified in my upset and do something like friends-lock my LJ or just abandon it.

Meanwhile, I think what bothers me is being blindsided, along with my perception that LJ officialdom gives an insufficient damn.

Thanks again.
Monday, September 19th, 2005 03:00 am (UTC)
I haven't the *foggiest* where it's documented. For all I know there are millions of useful commands that could be entered at the console, and I don't know where to find information on them. If I find out I'll post. (I may not go looking soon though.)

"I am not certain how sensible being miffed is."

Yeah, I'm about like you -- perhaps less miffed, but definitely uncomfortable. It broke some assumptions. There's a difference between a post being public and a post being copied automatically and sent to the equivalent of my town's public library. Still, I'm leaning on the side of "I'm being a daft geezer" myself, and I think I'll probably get over it and let Google index the hell outta this journal. Searchers can knock themselves out. My content is really quite boring overall. :-)
Monday, September 19th, 2005 05:43 am (UTC)
http://cvs.livejournal.org/browse.cgi/livejournal/cgi-bin/console.pl

is the code - or more precisely, the version control log, but you can get the code from there. That's at least a place to look for command names which might help find other documentation. (and yes, I found that with google: livejournal cvs console, after searching for: synlevel console turned up the rss feed implementation :-)

(I'm actually pleased that google is doing this - because they're doing it in public, people are finding out where their assumptions were broken, instead of it happening behind their backs - it's not like the *reality* is any different now, just the extent to which people were informed of it. Now they can figure out the kind of control they want, and agitate for it.)
Monday, September 19th, 2005 04:12 pm (UTC)
Thanks for the link!

Philosophically, I agree wholeheartedly with your last paragraph. Finding out where assumptions don't match reality is a GOOD thing. Reality does have a way of winning. Better to know.
Monday, September 19th, 2005 04:13 am (UTC)
Very useful! Thank you.
Monday, September 19th, 2005 04:14 pm (UTC)
Y'welcome! It feels incomplete to me -- contact google how? what does synlevel control exactly, and what else can it be set to? -- but I'm glad [livejournal.com profile] runeshower posted it 'cause it's a lot better'n nothing.