Paraphrasing some stuff in a friendslocked entry I found. Good info; I'll try to do it justice. Hat tip to runeshower.
Google is currently indexing the contents of LiveJournal blogs' RSS feeds.
RSS feeds don't (at the moment) contain a "block robots/spiders" setting. Google is working on how to respect this setting ANYWAY; they expect to have a fix within the next week.
RSS feeds contain only your public entries. If private or friendslocked content shows up on Google Blog Search, it was indexed at a time when those entries were public. To request its removal you'll have to talk to Google about it.
You can't completely shut off your RSS feed, but you CAN shrink it down to contain only the titles of your entries. To do this, go to the LiveJournal console and enter the command "set synlevel title".
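If you want to double-check what a feed is actually handing out after changing that setting, here's a rough Python sketch. Big caveats: the two sample feeds below are made up for illustration (they are not real LiveJournal output), and I'm assuming a title-only feed simply leaves out each item's description element. The function name items_with_body is my own invention, too.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feeds for illustration only; not real LiveJournal output.
FULL_FEED = """<rss version="2.0"><channel>
  <title>example journal</title>
  <item><title>First post</title><description>Full entry text here.</description></item>
  <item><title>Second post</title><description>More entry text.</description></item>
</channel></rss>"""

TITLE_ONLY_FEED = """<rss version="2.0"><channel>
  <title>example journal</title>
  <item><title>First post</title></item>
  <item><title>Second post</title></item>
</channel></rss>"""


def items_with_body(feed_xml):
    """Return the titles of feed items that carry non-empty body text."""
    root = ET.fromstring(feed_xml)
    exposed = []
    for item in root.iter("item"):
        # Assumption: entry text, when present, lives in <description>.
        description = item.findtext("description", default="")
        if description.strip():
            exposed.append(item.findtext("title", default="(untitled)"))
    return exposed


print(items_with_body(FULL_FEED))
print(items_with_body(TITLE_ONLY_FEED))
```

To try it on a real journal, you'd save the feed's XML to a string and pass it in. An empty list would suggest that only titles are going out, at least under the assumption above.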
_____________________________
I've done the "set synlevel title" command, and I have "block robots/spiders" checked. However, now that I'm thinking about this a bit, I'm half inclined to allow spiders and let Google index my posts. One of the reasons I flail about with LJ memories so much is that it's so danged hard to SEARCH. Google is certainly equipped to solve that problem! And hey, all public content was already public well before Google began caching it. Pretending there are folks who can't see it or won't find it is a recipe for getting badly hurt. All I'll have to do is make sure I don't put my e-mail address in my entries. I get enough spam.
no subject
Your final paragraph is very helpful, since I'm flailing in the same way for the same reason. I have bots and spiders blocked because I don't want a bunch of strangers showing up from all over the web -- but as my list of memories grows to where I need search engines just for them, maybe it's worth it. I'll never have enough links to my journal to get a PageRank above 827 or so anyway, so what's to worry about?!
If you unblock spiders, will Google crawl all the way back and index all the public entries? Or only the ones that are posted after the unblocking?
no subject
I don't know whether Google will crawl into the past. Interesting question. It'd be pretty useful to my search capabilities if it did. :-)
no subject
Another thing to know about is that there are "live feeds" of all public postings in real time. That's probably not what google uses (since they have older postings too, though it might be a way to discover feeds to look at), but technorati and that sort of thing do. (I pulled that feed into an appliance at work for a couple of days, as a volume scaling test, but also to see if lj is "interestingly geographic". Lots of good "casual" writing...) I assume synlevel controls that too.
no subject
That'd be my guess, but I really know squat about it. :)
no subject
This RSS feed business definitely has me miffed, although I am not certain how sensible being miffed is. I may decide that I am being like one of those daft geezers who throws a fit because someone linked to his publicly accessible web page without his explicit permission. Or I may decide that I am justified in my upset and do something like friends-lock my LJ or just abandon it.
Meanwhile, I think what bothers me is being blindsided, plus my perception that LJ officialdom gives an insufficient damn.
Thanks again.
no subject
"I am not certain how sensible being miffed is."
Yeah, I'm about like you -- perhaps less miffed, but definitely uncomfortable. It broke some assumptions. There's a difference between a post being public and a post being copied automatically and sent to the equivalent of my town's public library. Still, I'm leaning on the side of "I'm being a daft geezer" myself, and I think I'll probably get over it and let Google index the hell outta this journal. Searchers can knock themselves out. My content is really quite boring overall. :-)
no subject
is the code - or more precisely, the version control log, but you can get the code from there. That's at least a place to look for command names which might help find other documentation. (and yes, I found that with google: livejournal cvs console, after searching for: synlevel console turned up the rss feed implementation :-)
(I'm actually pleased that google is doing this - because they're doing it in public, people are finding out where their assumptions were broken, instead of it happening behind their backs - it's not like the *reality* is any different now, just the extent to which people were informed of it. Now they can figure out the kind of control they want, and agitate for it.)
no subject
Philosophically, I agree wholeheartedly with your last paragraph. Finding out where assumptions don't match reality is a GOOD thing. Reality does have a way of winning. Better to know.