
Thread: removing session ID for Googlebot ?

  1. #1
    tiendung619299
    Guest

    removing session ID for Googlebot ?


    php Code:

    if ($_SERVER['HTTP_USER_AGENT'] != 'Googlebot') {
        session_start();
    }

    Would this do the trick ?


  2. #2
    moneyisall91
    Guest

    I don't build any sites that use PHP sessions, but I'd assume you have to use a regular expression instead of comparing against just 'Googlebot', because the user-agent string carries version numbers and there's the "new" one too, right?

    Something like !~ /googlebot/i



    Just an idea.
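
To illustrate the point above, here is a minimal sketch of why an exact comparison never matches. It assumes a user-agent string of the form Googlebot commonly sends; the string is hard-coded and the variable names ($ua, $exact, $pattern) are hypothetical, chosen only for this example.

```php
<?php
// Sample user-agent string, hard-coded for illustration.
$ua = 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)';

// Exact comparison: false, because the header holds more than the bare name.
$exact = ($ua === 'Googlebot');

// Case-insensitive pattern match: true, because the name appears inside the string.
$pattern = (preg_match('/googlebot/i', $ua) === 1);

var_dump($exact);   // bool(false)
var_dump($pattern); // bool(true)
```

With the exact comparison, the condition in the first post would be true for every visitor, Googlebot included, so the session would always start.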

  3. #3
    ts3
    Guest

    I think preg_match is what you would use in PHP (I'm a Perl coder, not a PHP one).

    Edit: try this?


    Code:

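    // Start a session unless the user agent contains "googlebot" (case-insensitive).
    // The header can be missing, so fall back to an empty string.
    $ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
    if (preg_match('/googlebot/i', $ua) !== 1) {
        session_start();
    }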


  4. #4
    concucu021
    Guest

    Ok, I will check the user agent string with regular expressions then ... thanks.

  5. #5
    qwedcxzas
    Guest

    strpos or strstr would be a lot faster than preg_match:


    Code:

    if( strpos( $_SERVER['HTTP_USER_AGENT'], "Googlebot" ) !== false ) {
    // we've found a googlebot
    }else{
    // session initialization code
    }
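
Putting the suggestions in this thread together, one possible shape is sketched below. It is only a sketch under stated assumptions: the helper names is_googlebot() and maybe_start_session() are hypothetical, and it assumes PHP 7.1 or later (for `??` and the `void` return type). It uses stripos(), the case-insensitive counterpart of strpos(), so that "googlebot" and "Googlebot" both match.

```php
<?php
// Hypothetical helper: true when the User-Agent header mentions Googlebot.
function is_googlebot(array $server): bool
{
    // The header may be absent, so fall back to an empty string.
    $ua = $server['HTTP_USER_AGENT'] ?? '';

    // stripos() returns false when the substring is not found; a match at
    // offset 0 is a valid result, hence the strict !== comparison.
    return stripos($ua, 'googlebot') !== false;
}

// Hypothetical helper: start a session only for visitors that do not
// identify as Googlebot, and only if no session is active yet.
function maybe_start_session(array $server): void
{
    if (!is_googlebot($server) && session_status() === PHP_SESSION_NONE) {
        session_start();
    }
}
```

A page would then call `maybe_start_session($_SERVER);` before sending any output. Note that the User-Agent header is supplied by the client and can be spoofed, so this check only identifies requests that claim to be Googlebot.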
