Post Information Systems Grad School

Wednesday, April 18, 2007

They tried to get my server to join a BotNet

I first noticed a mysterious connection on a netstat:

Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 1 mrtg.sampas.net:42321 223-118-92-213.server:49153 SYN_SENT
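
A socket sitting in SYN_SENT means the local machine is trying to open a connection to somewhere else, which is exactly what a web server should not be doing on its own. Here is a minimal sketch of how that check could be automated, assuming the netstat text has already been captured into a string. The function name is hypothetical, and the sample text is copied from the output above.

```python
# Sample text copied from the netstat output in this post.
NETSTAT_OUTPUT = """\
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 1 mrtg.sampas.net:42321 223-118-92-213.server:49153 SYN_SENT
"""


def find_outbound_attempts(netstat_text):
    """Return (local, foreign) pairs for TCP sockets in the SYN_SENT state."""
    hits = []
    for line in netstat_text.splitlines():
        fields = line.split()
        # A TCP row has six fields: proto, recv-q, send-q, local, foreign, state.
        is_tcp_row = len(fields) == 6 and fields[0].startswith("tcp")
        if is_tcp_row and fields[5] == "SYN_SENT":
            hits.append((fields[3], fields[4]))
    return hits


if __name__ == "__main__":
    for local, foreign in find_outbound_attempts(NETSTAT_OUTPUT):
        print(f"outbound attempt: {local} -> {foreign}")
```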

I also noticed Apache had opened a shell:

F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD
1 S apache 18005 1 0 76 0 - 1282 - Apr10 ? 00:00:00 sh -i

Soon after, perl became a runaway process, consuming 100% of my CPU time. And I thought /sbin/nologin meant user Apache couldn't just get a shell. I updated zlib from 1.2.2 to 1.2.3 to fix a security hole. up2date -u reports everything is up-to-date. (It did that for my old zlib, too.) I don't see any new holes in my applications, MT and Gallery. I did a Nessus scan with recent updates, and it showed no holes and one warning. I ran clamscan and it didn't find anything, either. Rkhunter found nothing, and nikto gave me the following output:

+ Server: Apache/2.0.52 (Red Hat)
+ Allowed HTTP Methods: GET,HEAD,POST,OPTIONS,TRACE
+ Apache/2.0.52 appears to be outdated (current is at least Apache/2.2.3). Apache 1.3.33 is still maintained and considered secure.
+ / - TRACE option appears to allow XSS or credential theft. See http://www.cgisecurity.com/whitehat-mirror/WhitePaper_screen.pdf for details (TRACE)
+ /usage/ - Webalizer may be installed. Versions lower than 2.10-09 vulnerable to Cross Site Scripting (XSS). CA-2000-02. (GET)

So now I'm stuck looking through my apache access logs, because that's the only thing exposed to the outside world.

I ran a capture while I opened my firewall for a couple of minutes, and I saw it try to log in to an IRC channel. Ouch. I've been pwned. Fortunately, my firewall stops my server from being used for attacks, and I was able to block the port range used by the IRC bot.

Finally, Red Hat released a lot of new patches for PHP, and I set PerlTaintCheck On in /etc/httpd/conf.d/perl.conf. Perl was the real problem: user Apache had started listening on port 80 using Perl, so I couldn't even restart httpd.

Next time, I need to check the logs more closely and post a network capture of the login process.

Monday, April 9, 2007

I graduate from grad school, get a new job, a new house, and become a father.

I graduated from Grad School with a Master's in Information Systems Technology, focusing on Management Information Systems.

It's official. George Washington University sent me my degree in the mail. They took three and a half months to get it out. Even the registrar didn't change my status until March after a couple of phone calls. A lot has happened since then: I moved into a new (old) house, started a new job, and am about to become a father.

What did I really learn in grad school anyway? I learned a lot, but every class covered, to some degree, entity-relationship diagrams (ERDs), data-flow diagrams (DFDs), and object-oriented diagrams, which can be state charts, class diagrams, and use cases, to name a few. Some classes went so far as to cover the theory behind them. Every class covered the relational database model, which hasn't changed much in thirty years and is still useful and relevant to just about every information system I've ever worked with.

Since IS grad school is part of the business school, we learned to work in teams. It's not about writing code -- it's about finishing projects on time. The funny part is the professors don't teach much about team projects -- they just expect you to manage yourselves.

Monday, October 2, 2006

How to Reverse Engineer a Database with Microsoft Visio

What do we study in Information Systems Grad School? If there's a single topic that comes up in every class, it's databases. If we haven't memorized the first three normal forms by now, we haven't learned much. While few of us will bother going into Boyce-Codd Normal Form, 4NF and 5NF, every specification for a system we write that has a database needs an entity relationship diagram. An ERD is a visual representation of your data model, and your data model is probably the single most important part of any system you design. A good data model will survive several major versions of your software; a poor data model will make your system useless. Thus, we spend a lot of time doing data models and documenting them with ERDs.

As much as I love Visio, drawing the things from scratch is somewhat tedious. It's much easier to design and test in Access. (I have it on good authority that even elite Oracle DBAs who hand-tune Solaris for better performance will design and test in Access just because it's easy.) So what do you do when you have a decent test DB in Access and you don't want to diagram every little change in your masterful Visio ERD? Reverse engineer.

In Visio, it's fairly easy, but there are a couple of spots where it doesn't behave as nicely as it should. I'm going to refer to Visio 2007, still in Beta and free for the download and registration. Visio 2003 is almost the same. Visio 2003 Enterprise Architect Edition will create the database from your diagram, in case you can design an enterprise DB but don't know how to create the tables in SQL. (Not really someone you'd want touching your SQL server.)

1. Open Visio and select New | Software & Database | Database Model Diagram with the units of your choice.
2. Now that you have a database model diagram open, the Database menu will appear. Select "Reverse Engineer" from the Database menu.
3. A confusing dialog box will appear. Use it to verify you have the right drivers installed.
4. For Microsoft Access, choose Microsoft Access as your driver, and hit Next.
5. A username and password dialog box will pop up. Unless you've assigned a username and pw to the database, leave it blank and hit OK.
6. Navigate your filesystem and select your database. Ignore the clunkiness and be grateful that you can see filenames longer than 8.3.
7. When you have found your .MDB file, choose it and hit OK.
8. Select the types of objects you would like to import and hit Next. (No, you don't get stored procedures and triggers in Access, but you would in SQL.)
9. Select the specific tables, queries, etc. you want to see in your diagram and hit Next.
10. Select Yes to add them to your current diagram, and hit Finish. (Select No if you have a lot of tables, queries, etc.)
11. You should see your tables in the diagram.
12. To add the crow's feet and cardinality, select Options on the Database menu. (Database | Options | Document)
13. You get three sections to change here. The General tab covers symbol sets: IDEF1X or Relational, Conceptual, Physical, both, or names based on symbol set. The Table tab lets you display keys, indexes, non-keys, and the IDEF1X optionality 0. The Relationship tab lets you display relationships (duh), crow's feet, cardinality, and referential actions. You must select cardinality before you select crow's feet. (Cardinality gets greyed out when crow's feet is checked.)
14. To update your diagram, select "Refresh Model" on the Database menu...
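
For the curious, the same idea can be sketched in a few lines of code: ask the database for its tables, columns, keys, and relationships, and build the model from that. This is not what Visio does internally (I don't know how it does it), and it uses SQLite from Python's standard library rather than Access. The two-table schema and the function name are made up for illustration.

```python
import sqlite3


def reverse_engineer(conn):
    """Return {table: {"columns": [...], "foreign_keys": [...]}} for a SQLite database."""
    model = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = [
            (row[1], row[2], bool(row[5]))
            for row in conn.execute(f'PRAGMA table_info("{table}")')
        ]
        # PRAGMA foreign_key_list rows:
        # (id, seq, table, from, to, on_update, on_delete, match)
        foreign_keys = [
            (row[3], row[2], row[4])
            for row in conn.execute(f'PRAGMA foreign_key_list("{table}")')
        ]
        model[table] = {"columns": columns, "foreign_keys": foreign_keys}
    return model


# Hypothetical schema, made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE invoice (
        invoice_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        total REAL
    );
""")

model = reverse_engineer(conn)
for table, info in model.items():
    print(table)
    for name, col_type, is_pk in info["columns"]:
        print(f"  {name} {col_type}{' PK' if is_pk else ''}")
    for column, ref_table, ref_column in info["foreign_keys"]:
        print(f"  {column} -> {ref_table}.{ref_column}")
```

Each foreign key it prints is one relationship line in the ERD, drawn from the referencing table to the table it points at.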

Now that you can see your information model, you know why it's messed up. That database that your business/organization/department runs on -- it's not in any kind of normal form. Or it has about 100 tables more than you thought it should.

The database people in my office have a debate: were the software engineers just trying to make it impossible to wean your organization off of their support, or were they just bad at information modeling?

Friday, September 22, 2006

Visit my new Educational Technologies Blog

Given that I have a new job in education, I started doing a few entries about recent developments in education and technology. I use the framework of educational technology to answer President Bush's question, "Is our children learning?"

At least it explains the dearth of recent posts here.

Thursday, September 14, 2006

Threat Analysis and Modeling Tool, Office Groove Beta

Just when I thought I would stop experimenting with new Microsoft products and hunker down into my new job and my capstone project, I find a couple that will save me time and aggravation. The first is Microsoft's Threat Analysis and Modeling Tool. This .NET 2.0 application has a wizard that creates a CRUD matrix for us, which we can paste into our system security chapter. A CRUD matrix is simply a table of which users need what kind of access to which tables. It can get more complex if you have column-level security in your database. We don't, so it won't be overly detailed.
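
To make the idea concrete, here is a rough sketch of a CRUD matrix built by hand. It is not the output of Microsoft's tool. The roles, tables, and grants are all hypothetical, and the letters stand for Create, Read, Update, and Delete.

```python
# Hypothetical roles, tables, and permissions, made up for illustration.
PERMISSIONS = {
    "registrar": {"student": "CRUD", "course": "R", "enrollment": "CRUD"},
    "instructor": {"student": "R", "course": "RU", "enrollment": "R"},
    "student": {"course": "R", "enrollment": "CR"},
}


def crud_matrix(permissions):
    """Render a roles-by-tables grid; '-' marks no access at all."""
    tables = sorted({table for grants in permissions.values() for table in grants})
    width = max(len(name) for name in list(permissions) + tables + ["role"]) + 2
    rows = [["role"] + tables]
    for role, grants in permissions.items():
        rows.append([role] + [grants.get(table, "-") for table in tables])
    return "\n".join(
        "".join(cell.ljust(width) for cell in row).rstrip() for row in rows
    )


print(crud_matrix(PERMISSIONS))
```

With column-level security, each cell would have to break down further by column, which is where the matrix gets more complex.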

The other tool that can save me time, and possibly my group members, is Microsoft's Office Groove 2007 Beta. Like many cool Microsoft products, Groove did not originate at Microsoft. Most people I talk to about Groove don't really get what it does, but I blame that more on bad IT and IS metaphors than anything else. Groove is simply a shared workspace where a team can share documents, discussions, and contacts. When you set up Groove, you can set up one or more shared workspaces. You then add specific files to that workspace. Then all your invited and verified team members can access those documents whether you are online or off. No more uploading to SharePoint or Blackboard, or whatever. You've got anywhere, anytime sharing, provided you have an Internet connection. I have it on my work computer and my home computer and share documents between them without having to email them back and forth. So far, I have no firewall issues either.

What if you want to secure your data and not have it stored on third-party servers? Use Groove Server, and give more money to Microsoft. Before the Office 2007 launch, both Groove and Groove Server are free for the download, registration and product keys required. Try it and at least you'll know what people are talking about. The next big thing is enabling teamwork and collaboration and making the world "flat."

Friday, August 4, 2006

Services for Unix in Six Easy Steps

After tiring of redoing smb.conf files over and over again, I finally tried out Microsoft's (free) Services for Unix for simple file sharing between my Fedora Core 4 box and servers on my domain.

1. Copy over /etc/passwd and /etc/group to a secure folder on the windows machine where you are going to install SfU.
2. Download and install Microsoft's Services for Unix, and tell it to use password files during the installation process. (This is not a lesson on setting up a NIS domain -- just connecting one Linux box to your Windows server quickly and reliably. SfU installs several other items by default, like Unix Perl and grep; ActiveState Perl is optional. You want NFS Server.) Reboot.
3. In Windows, right click on the folder you'd like to share, click the NFS Sharing tab, and select "share this folder."
4. On your Linux box, as root, add a line like this to /etc/fstab:
myserver.com:/somefolder /somefolder nfs defaults 0 0
5. On Linux, mount /somefolder
6. cd /somefolder and ls -la to your heart's content.
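
If the fstab line looks cryptic, it is six whitespace-separated fields: the device (here an NFS server and its exported path), the local mount point, the filesystem type, the mount options, and the dump and fsck pass numbers. A small sketch that pulls the example line apart; the class and function names are hypothetical.

```python
from typing import NamedTuple


class FstabEntry(NamedTuple):
    device: str       # for NFS: server:/exported/path
    mount_point: str  # local directory to mount on
    fs_type: str
    options: str
    dump: int         # 0 means the dump backup tool skips it
    fsck_pass: int    # 0 means no filesystem check at boot


def parse_fstab_line(line):
    """Split one fstab entry into its six fields."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    device, mount_point, fs_type, options, dump, fsck_pass = fields
    return FstabEntry(device, mount_point, fs_type, options, int(dump), int(fsck_pass))


entry = parse_fstab_line("myserver.com:/somefolder /somefolder nfs defaults 0 0")
host, _, export = entry.device.partition(":")
print(f"{entry.fs_type} export {export} on {host}, mounted at {entry.mount_point}")
```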

There are many security implications for Windows and Linux, like sharing your password and group files, and I'll sniff and try cracking the passwords later.

Thursday, August 3, 2006

Business Intelligence Studio: A Wizard for your Data Mart

My grad school project group finished our data mart for class. I learned a few things about SQL Server 2005 Analysis Services, Reporting Services, and Business Intelligence Studio along the way. One of our group members is a SAS programmer, so he provided us with simulated data: attendance records for a theoretical amusement park that included zip code and promotion type, with zip-code ACORN differences. Our idea was that if you came to the park with a coupon, we would know from the bar code where you came from; if you didn't have a coupon, the gate would ask you your zip code, just like at the retail store.

We had six million rows in our fact table, which included dateID, promotionID, zipID, and attendance, giving us three dimensions. We were going for scalability rather than trying to pile in a lot of dimensions. I took the flat files from our SAS master and imported them into a SQL 2005 database. For some reason, SQL 2005's import tool defaults to an nvarchar(50) type. While six million records isn't much, the records were fixed-size and much smaller, so I was able to stuff them in an nchar type that was small and efficient because it doesn't require an offset column array. Our six million fact table records matched to 5 types of promotion, about 10,000 zip codes, and individual days for 3 years.

Once we created the database, I exited SQL Server Management Studio and opened up a new Business Intelligence Studio project. One difference between Management Studio and BI Studio is that BI Studio does not use SQL authentication. You must use Windows/Domain accounts. There is no sa in Business Intelligence Studio.

Once in BI Studio, you create a data source, just like with any project that involves a database. Then you create a data source view by selecting the tables you want to use and creating a dimensional model diagram. Microsoft calls this their Unified Dimensional Model. While your fact table doesn't need a primary key, your dimension tables do. The arrows should be pointing from your fact table to the dimension tables. I don't know why, but I often get this wrong when I'm creating a new data source view.

Finally, your fact table columns need to reference the right dimensions. Matching these up took a little longer than expected: our SAS genius didn't document which numbers were what, so I had to guess which fields were which. I looked at the 8-digit numbers and guessed wrong. The auto-build cube wizard worked fine, and the dimensions looked OK. Processing the cube failed when it timed out, and the error messages didn't say anything about orphaned fact rows.
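
Since the error messages won't point at orphaned fact rows, it is worth checking for them before processing the cube. An orphan is a fact row whose key has no matching row in the dimension table, and a LEFT JOIN that keeps only the unmatched rows will count them. The sketch below uses SQLite from Python's standard library rather than SQL Server. The fact column names come from our fact table, but the table names, the helper function, and the handful of sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical dimension and fact tables with made-up sample rows.
    CREATE TABLE dim_date (dateID INTEGER PRIMARY KEY, calendar_date TEXT);
    CREATE TABLE dim_promotion (promotionID INTEGER PRIMARY KEY, promotion_name TEXT);
    CREATE TABLE dim_zip (zipID INTEGER PRIMARY KEY, zip_code TEXT);
    CREATE TABLE fact_attendance (
        dateID INTEGER, promotionID INTEGER, zipID INTEGER, attendance INTEGER
    );
    INSERT INTO dim_date VALUES (1, '2006-07-01'), (2, '2006-07-02');
    INSERT INTO dim_promotion VALUES (1, 'coupon'), (2, 'none');
    INSERT INTO dim_zip VALUES (1, '20052');
    INSERT INTO fact_attendance VALUES
        (1, 1, 1, 40), (2, 2, 1, 25), (2, 9, 1, 5), (3, 9, 7, 8);
""")

# Fact-table key column -> dimension table holding the same key.
DIMENSIONS = {"dateID": "dim_date", "promotionID": "dim_promotion", "zipID": "dim_zip"}


def count_orphans(conn, fact_table, dimensions):
    """Count fact rows whose key has no match in each dimension table."""
    orphans = {}
    for key, dim_table in dimensions.items():
        sql = (
            f"SELECT COUNT(*) FROM {fact_table} AS f "
            f"LEFT JOIN {dim_table} AS d ON f.{key} = d.{key} "
            f"WHERE d.{key} IS NULL"
        )
        orphans[key] = conn.execute(sql).fetchone()[0]
    return orphans


print(count_orphans(conn, "fact_attendance", DIMENSIONS))
```

Any count above zero means a fact column is matched to the wrong dimension or the data really is missing keys.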

Once the columns were matched up correctly, everything worked fine, and the auto-build cube wizard is pretty impressive. If you did everything right, building the cube should give you dimensions. You can create new dimensions if the wizard missed some. The time dimension has many options. At first I set time to be a regular dimension to avoid problems: is day of week a number or a name? Monday or 1? While this will give you good reports, the order of weekdays and months will be alphabetical. (Which makes me think we should rename all month and day names so they can be alphabetical and still be in order, or re-sort them so that April is the first month and Friday the first day of the week.) Setting the time dimension as a time dimension rather than a "standard" dimension will fix this.
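
The alphabetical-order problem is easy to reproduce outside of Analysis Services. This little sketch sorts the names as plain strings, the way a standard dimension would, and then sorts them by their position in the calendar, which is roughly what a real time dimension gives you. The helper name is hypothetical, and I'm assuming a Monday-first week.

```python
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def in_calendar_order(names, reference):
    """Sort names by their position in the reference calendar list."""
    position = {name: index for index, name in enumerate(reference)}
    return sorted(names, key=position.__getitem__)


alphabetical_days = sorted(WEEKDAYS)
alphabetical_months = sorted(MONTHS)

# Plain string sorting puts Friday and April first.
print(alphabetical_days[0], alphabetical_months[0])  # prints "Friday April"

# Sorting by calendar position restores the expected order.
print(in_calendar_order(alphabetical_days, WEEKDAYS))
```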

Processing the cube took under five minutes. (You also need appropriate account permissions to process the cube.) If the cube processes successfully, you can browse pivots immediately with the browser in BI Studio. Once you see that your dimensions work, you can start connecting to SQL Server Analysis Services from Excel or from SS Report Server. (Again, with appropriate Windows Domain accounts.)

Microsoft's BI Studio Cube and Dimension wizards work the way they should: if you have defined your dimensional model accurately and your data are valid, they'll create the right dimensions for you. The wizards are no substitute for a valid dimensional model and they can't fix bad data or orphaned records.

Next: Creating Pivot Table Reports using our cube and SQL Server Reporting Services for IIS.