fastix®: Parsing an HTML page / session cookies

Hi!

Thanks! So where on the server does it have to go?

Simply put it wherever the tool of your choice (e.g. wget) can send it to the server as a cookie.

RTFM!

man wget

[...]
      --load-cookies file
           Load cookies from file before the first HTTP retrieval.
           file is a textual file in the format originally used by
           Netscape's cookies.txt file.

           You will typically use this option when mirroring sites
           that require that you be logged in to access some or all
           of their content.  The login process typically works by
           the web server issuing an HTTP cookie upon receiving and
           verifying your credentials.  The cookie is then resent by
           the browser when accessing that part of the site, and so
           proves your identity.

           Mirroring such a site requires Wget to send the same
           cookies your browser sends when communicating with the
           site.  This is achieved by --load-cookies: simply point
           Wget to the location of the cookies.txt file, and it will
           send the same cookies your browser would send in the same
           situation.  Different browsers keep textual cookie files
           in different locations:

           Netscape 4.x.
               The cookies are in ~/.netscape/cookies.txt.

           Mozilla and Netscape 6.x.
               Mozilla's cookie file is also named cookies.txt,
               located somewhere under ~/.mozilla, in the directory
               of your profile.  The full path usually ends up
               looking somewhat like
               ~/.mozilla/default/some-weird-string/cookies.txt.

           Internet Explorer.
               You can produce a cookie file Wget can use by using the
               File menu, Import and Export, Export Cookies.  This has
               been tested with Internet Explorer 5; it is not guaranteed
               to work with earlier versions.

           Other browsers.
               If you are using a different browser to create your
               cookies, --load-cookies will only work if you can locate
               or produce a cookie file in the Netscape format that
               Wget expects.

           If you cannot use --load-cookies, there might still be an
           alternative.  If your browser supports a "cookie manager",
           you can use it to view the cookies used when accessing the
           site you're mirroring.  Write down the name and value of the
           cookie, and manually instruct Wget to send those cookies,
           bypassing the "official" cookie support:

               wget --cookies=off --header "Cookie: <name>=<value>"

[...]
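
To make the quoted options a bit more concrete, here is a minimal sketch, not a tested recipe for your site. It writes a one-line cookie file in the Netscape format and then only prints the two wget calls that would use it, so nothing is fetched. The domain example.org, the cookie name sessionid, its value and the URL path are placeholders I made up; take the real name and value from your browser. Also note that the man page excerpt above still says --cookies=off, while newer wget releases spell that option --no-cookies, so check man wget on your system.

```shell
#!/bin/sh
# Sketch: build a Netscape-format cookies.txt by hand and show how wget
# could be pointed at it. All values below are hypothetical placeholders.

COOKIE_FILE="./cookies.txt"
DOMAIN="example.org"          # assumed host name
NAME="sessionid"              # assumed cookie name
VALUE="0123456789abcdef"      # assumed session ID copied from the browser
URL="http://$DOMAIN/protected/page.html"   # assumed URL

# Netscape format: seven TAB-separated fields per line:
# domain, include-subdomains flag, path, secure flag,
# expiry (Unix time), cookie name, cookie value.
{
  printf '# Netscape HTTP Cookie File\n'
  printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
    "$DOMAIN" "FALSE" "/" "FALSE" "2147483647" "$NAME" "$VALUE"
} > "$COOKIE_FILE"

# Variant 1: let wget read the cookie file (printed, not executed).
printf '%s\n' "wget --load-cookies $COOKIE_FILE $URL"

# Variant 2: bypass the cookie support and send the header by hand
# (printed, not executed; older wget versions use --cookies=off).
printf '%s\n' "wget --no-cookies --header 'Cookie: $NAME=$VALUE' $URL"
```

The far-future expiry value is only there so that the hand-written entry is not treated as stale; a real session cookie stops working as soon as the server invalidates the session, no matter what the file says.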

MFFG (Mit freundlich-friedfertigem Grinsen: with a friendly, peaceable grin)

fastix®

--
As a freelancer I am always looking for work: training courses, development. Also for reputable agencies.