Post Information Systems Grad School: October 2009

Friday, October 30, 2009

Using Logparser to dump Bluecoat log files into SQL

Working with Bluecoat files in the raw can be time-consuming. Findstr and grep only work so fast. Windows grep is slow. I know SQL syntax OK, so I tend to dump logfiles into databases to analyze them for activity. There are certainly other ways to do it, such as using a reporting tool for Bluecoat. (Splunk's free Bluecoat application, e.g.).

Theoretically, Bluecoat logfiles are the same as W3C web server log files that logparser can consume via the -i:W3C directive.

You can see the fields in a Bluecoat log below.

#Fields: date time time-taken c-ip cs-username cs-auth-group x-exception-id sc-filter-result cs-categories cs(Referer) sc-status s-action cs-method rs(Content-Type) cs-uri-scheme cs-host cs-uri-port cs-uri-path cs-uri-query cs-uri-extension cs(User-Agent) s-ip sc-bytes cs-bytes x-virus-id

For some reason, Bluecoat leaves two spaces between cs(Referrer) and sc-Status, so all the columns to the right of sc(Referrer) past that will be one off. BlueCoat also leaves spaces in cs-categories and surrounds them with quotation marks, so you need to specify -dQuotes:on. Logparser doesn't have a quick and easy way to handle the double-spaces issue, so I wrote a VB Script to handle it. (VBScript is pretty quick at text handling and it's much faster than using search and replace in WordPad or Notepad on a 500-1000 MB File.)

Here's the VBScript:
'start

Set objFSO = CreateObject("Scripting.FileSystemObject")
'change this line to wherever you want to read the input from.
Set objTextFile = objFSO.OpenTextFile("c:\myBluecoatlog.log",1)
Set objNewFile = objFSO.CreateTextFile("c:\myCleanBlueCoatlog.log")
Do Until objTextFile.AtEndOfStream

myString = objTextFile.Readline
objNewFile.WriteLine(Replace (myString, " ", " "))
Loop
'end vbscript
Here's the logparser file:
-------------------start
SELECT TO_LOCALTIME(TO_TIMESTAMP(date, time)) AS date,
time-taken,
c-ip,
cs-username,
cs-auth-group,
x-exception-id,
sc-filter-result,
cs-categories,
cs(Referer) AS Referer,
sc-status AS scStatus,
s-action,
cs-method,
rs(Content-Type) AS ContentType,
cs-uri-scheme,
cs-host,
cs-uri-port,
cs-uri-path,
cs-uri-query,
cs-uri-extension,
cs(User-Agent) AS UserAgent,
s-ip,
sc-bytes,
cs-bytes,
x-virus-id

INTO BlueCoat4
FROM c:\myCleanBlueCoatlog.log
------------------end
And here's the command line for logparser. (Save the logparser file as c:\scripts\log\bluecoat.sql)

logparser file:c:\scripts\log\bluecoat.sql -i:W3C -o:SQL -server:sqlservername -database:BLUECOAT -createtable:ON -dQuotes:ON

Statistics:
-----------

Elements processed: 613076
Elements output: 613076
Execution time: 241.20 seconds (00:04:1.20)
About 2500 lines/sec. Processor utilization is almost zero for SQL and logparser, so it's all about disk time.

The above is from a file that's 310,935,417 bytes large. That means BlueCoat logs are about 507 bytes per line, or 0.5k per line before compression. The last time I checked BlueCoat gz compression, it was about 15% of the original file size. Compressed, the line would cost you 76 bytes.

Monday, October 12, 2009

How I compiled Darkice

Usually, installing an application from source on Linux/Solaris/BSD is easy:

./configure --help (Always look at the help to see the options. It makes a difference if, for instance, you compile php without support for MySQL.)

./configure

make

make install

However, with Darkice, it's prerequisites are numerous, and Darkice's configure doesn't find it's prereqs if they're installed in the standard locations. I've done this twice so far without documenting how I did it, so this time, I'm writing it down.

Here's my configure line:
./configure --with-vorbis-prefix=/usr/local/ --with-lame-prefix=/usr/local/lib/ --with-twolame=prefix=/usr/local/lib/ --with-faac-prefix=/usr/local/lib/

Then you'll get this when you launch darkice:
darkice: error while loading shared libraries: libmp3lame.so.0: cannot open shared object file: No such file or directory

So you need to link that, and when you link that you'll get the next error, so here are both:
ln -s /usr/local/lib/libmp3lame.so.0 /usr/lib/libmp3lame.so.0
ln -s /usr/local/lib/libfaac.so.0 /usr/lib/libfaac.so.0.
If you're wondering what links you're missing, try
ldd /usr/local/bin/darkice
If one of the links to the libraries reads "missing" then that's the one you need to link.

Yum install darkice might work for you, but then again, if you need all the features, it probably won't.
Prereq links are below. Generally ./configure, make, make install works well with all of them, but you really want to track exactly where each lib gets installed -- usually /usr/local/lib/.
Lame
Twolame
libogg
libvorbis
faac

Preqrequisite for faac or twolame -- I forget which:
libsndfile
Prereqs I didn't neeed:
Alsa
Jack

Sunday, October 4, 2009

How to tell when someone Googles you

Case 1: You Google me and click on my page

Yes, I'm using google as a verb. If you Google me and click on one of my pages, my web server logs the information:
1.2.3.4 - - [01/Oct/2009:10:23:41 -0400] "GET / HTTP/1.1" 200 7186 "http://www.google.com/search?hl=en&source=hp&q=larry+s&aq=f&aqi=g10&oq=&fp=7d15299a959dbb33" "Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.9.1.3) Gecko/20090824 Firefox/3.5.3 (.NET CLR 3.5.30729)"

As you can see, I get your IP address, a date, an offset to Universal Time (-0400), a verb (GET, in this case / means my default site page), a status code (200=OK), and a referrer. From the referrer, I can tell you Googled me with the phrase "larry s". Finally, I also get some information about the browser you used, Firefox, and the operating system, Windows XP with service pack 2. There's a chance you may have used a anonymizing proxy, but I'd still get an entry. (Generally, Anonymizer says "TuringOS," so I know it's them.)

Case 2: You Google me and don't click on my page.

That's more difficult but not impossible, because I have a Google AdWords account. I bought my own name as a keyword. Google AdWords works by selling keywords for search insertion. It's an open market, with the second-highest bidder winning in a dutch auction that is Google's revenue machine. When you buy a keyword, you get two measures back from Google:

how many impressions it got (viewing)

how many clickthroughs it got. (someone clicks on the ad)

A keyword ad's success is measured by the ratio of impressions to clickthroughs. The more clickthroughs per impression, the better. So if you don't click on my ad link, which I have made irresistable by promising dirt on me, I still know that someone Googled me, because the impression counter increments with each search. If you click on a regular page on my server rather than the keyword ad (Google calls this "organic"), we're also back to case one above. If you don't click on any of my links, I don't get any of the details from case one.

And that's how I know that you Googled me. If you're wondering if you've been Googled, but don't have a web site with logs you can comb through, or don't want to set up a Google AdWords account, try Google's external keyword tool. Just don't forget to un-check the synonyms box.

Thursday, October 1, 2009

Did you know October is Cybersecurity Awareness Month?

Niether did I. Hardly anyone knows, because few people take DHS seriously, and nobody outside of the Federal government has said "Cyber" since the nineties. I attended a computer security conference recently and listened to a panel of current and former federal officials speak about "Cyber" security. They might one day be able to secure government systems, but they're a long way off from protecting you and me online. One of the few things they can do to protect us is to stage a public awareness campaign -- thus we have Cybersecurity Awareness Month.

Why doesn't Google have a Cybersecurity graphic? Online providers don't want you to think about security. Banks don't want you to think about online security. If you thought about security when you signed up for online banking, you might not do it. Without the regulatory agencies, the banks would leave you liable for all losses -- event those caused by the bank's own security lapses, as happened in the UK.

A banking-industry consultant at the same conference said two striking things:

Bank marketers fought tooth and nail against FFIEC regulations requiring two-factor authentication for online banking logons. (That means you need your password AND something else to log on.) Banking marketers want to make easy for you (or a hacker) to log on and transfer funds.

Banking customer service representatives are just as dumb as the customers when it comes to online security.

If your bank account gets hacked, your bank isn't going to be of much help. They might get some money back, but in most cases, they won't. Your money's gone. The same goes for any other account of yours that gets hacked, whether it's Facebook, GMail, or Yahoo. Nobody's going to help you much.

So take the time now to do a few things to ensure your online security.

Use antivirus and make sure it's up to date. If you're on Windows, there are several free antivirus packages available, such as Microsoft Security Essentials , Avast , and Avira . Password-stealing viruses infect computers every day. If you want to tweak out on antivirus effectiveness comparisons, go here.

Patch your computer. It doesn't matter if you're windows, mac, unix, linux or bsd. Patch.

Change your banking password. Change your email password, because all your password resets go there. Change your security questions, because those reset your passwords. If you're using the same password from college, and your college system gets hacked and reveals your password, then they will find your other accounts.

Realize that you are a hundred times more likely to fall for a phishing email than you are to click on an online ad. (Phishing emails are now so common that you might get one that coincides with a recent transaction, making you think it's real.) Now that banks have increased their online security, the hackers are targeting you -- the soft spot.

Also realize there are are now office buildings full of professional hackers working in shifts trying to get to your money. (Another panelist, Chris Roberts, talked about research he had done observing the building in an unnamed country in Eastern Europe. Some of his work is available on McAfee's hacker-commerce site.)

Don't use unsecured wireless networks. Secure your home wireless network. (Replace WEP encryption with WPA or WPA2.)