I recently ran into a rare problem with a large PostgreSQL database that was filling up the local disks of a server. The database was over 100 GB, with about 300 million records. It had a lot of churn and had not been vacuumed in a long time. When I manually ran a vacuum, there was not enough working disk space to complete the operation, leaving me in a bind.
Instead of vacuuming, I decided to dump the database to a remote backup location, drop it, and restore it from the remote dump. I used SSH to run the remote commands.
Dump a table from a remote PostgreSQL database to the local machine
ssh user@remote-database-server 'pg_dump database-name -t table-name' > table-name.sql
Restore a remote PostgreSQL database dump to the local database server
ssh user@backup-machine 'cat table-name.sql' | psql -d database-name
Note that the dump command is run from the backup machine and the restore command is run from the database server.
Also note the single quotes around the remote command: they make pg_dump and cat run on the remote machine, while the redirection and the pipe are handled locally.
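If this dump-and-restore workflow needs to be scripted rather than typed by hand, the two pipelines can be assembled programmatically. A minimal Python sketch, with hypothetical user, host, database, and file names:

```python
import shlex

def remote_dump_cmd(user, host, database, table):
    """Build the ssh command that dumps one table from a remote
    PostgreSQL server to stdout (redirect stdout to a local file)."""
    remote = f"pg_dump {shlex.quote(database)} -t {shlex.quote(table)}"
    return ["ssh", f"{user}@{host}", remote]

def remote_restore_cmd(user, host, dump_file, database):
    """Build the two halves of the restore pipeline: cat the dump on the
    backup machine over ssh, and pipe its output into the local psql."""
    cat = ["ssh", f"{user}@{host}", f"cat {shlex.quote(dump_file)}"]
    psql = ["psql", "-d", database]
    return cat, psql

# Example (hypothetical names), wiring the pipe with subprocess:
# import subprocess
# with open("table-name.sql", "wb") as out:
#     subprocess.run(remote_dump_cmd("user", "db-server", "mydb", "mytable"),
#                    stdout=out, check=True)
```

The single remote string passed to ssh plays the same role as the single-quoted portion of the shell commands above.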
Wednesday, September 15, 2010
Thursday, August 19, 2010
The Google Chrome experiment
When Google first announced their Chrome browser, packed with a revamped JavaScript engine (V8) and support for offline web apps, I thought I would give it a spin. The first couple of releases were for Windows and Linux -- no Mac version. Those early versions were a little clunky and appeared to offer no better performance than other popular browsers. So I moved on.
When the Mac version became available, it was a much more polished browser. Another theoretical selling point was that each tab ran as a separate process, so one crashed tab would not crash the whole browser. I decided to give Chrome a serious workout on my Mac at home.
Things started out well enough and performance was good. I perused the help and learned some of the shortcuts. After about three good weeks, something went wrong. I don't know if it was an update, a growing cache, or something else, but it started slowing down. Then it started having problems loading pages from web sites that worked fine in other browsers. It is possible that it even caused wireless network issues, though that is just speculation at the moment. I need to do some more research to see if the problems were related to Chrome.
For now, I am sticking with Firefox as my main browser on the Mac. I'll come back and try Chrome out after the next major release.
Thursday, July 1, 2010
Finding IPs connected to your web server
On Mac OS X
Note: this also shows outgoing connections from web browsers.
Get all IPs connected to your web server:
netstat -nat | sed -n -e '/ESTABLISHED/p' | awk '{print $5}' | sed 's/\./ /g' | awk '{print $1"."$2"."$3"."$4}' | sort
Get all unique IPs connected to your web server:
netstat -nat | sed -n -e '/ESTABLISHED/p' | awk '{print $5}' | sed 's/\./ /g' | awk '{print $1"."$2"."$3"."$4}' | sort | uniq -c | sort -n
On Linux
Get all IPs connected to your web server:
netstat -ntu | sed -e 's/::ffff://g' | awk '{print $5}' | cut -d : -f1 | sort -n
Get all unique IPs connected to your web server:
netstat -ntu | sed -e 's/::ffff://g' | awk '{print $5}' | cut -d : -f1 | sort | uniq -c | sort -n
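The same counting can be done without the sed/awk gymnastics. A Python sketch that mirrors what the pipelines above compute, assuming the standard six-column netstat output (protocol, recv-q, send-q, local address, foreign address, state):

```python
from collections import Counter

def unique_remote_ips(netstat_output):
    """Count unique remote IPs among ESTABLISHED connections.
    The foreign address is the fifth column, formatted like
    10.0.0.5:443 on Linux or 10.0.0.5.443 on macOS."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) < 6 or fields[5] != "ESTABLISHED":
            continue
        foreign = fields[4]
        if ":" in foreign:                    # Linux style host:port
            ip = foreign.rsplit(":", 1)[0]
        else:                                 # macOS style host.port
            ip = foreign.rsplit(".", 1)[0]
        counts[ip] += 1
    return counts

# Feed it the raw command output:
# import subprocess
# out = subprocess.run(["netstat", "-nat"], capture_output=True, text=True)
# for ip, n in unique_remote_ips(out.stdout).most_common():
#     print(n, ip)
```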
Sunday, May 16, 2010
Monday, March 22, 2010
The mysterious Data Center Technical Specialist certification
On March 4, I received an email from Novell Technical Training saying that I had earned a Novell certification for "Data Center Technical Specialist". This came as a surprise to me because I had not applied for this certification, had not taken any tests for it, nor had I even heard of it.
Due to a cross marketing agreement with the Linux Professional Institute, I had applied for and received the Novell Certified Linux Administrator certification a few weeks prior. This seemed legitimate to me, as I had extensive experience with SUSE Linux and my Linux skills are still sharp. I continue to perform Linux server administration as part of my daily work.
However, I am not quite sure what the Data Center Technical Specialist is supposed to represent. Confused, I wrote to Novell Training asking what the certification meant. I received this equally mysterious reply:
Thank you for contacting Novell Training Services. You have received the certification as part of some changes we have made recently to our partner requirements. As part of these changes, some of the exams/certifications you have now count toward the new certification.
I searched the official Novell Certification web site, and this certification does not appear anywhere. I suspect, but can't confirm that it may be part of the Solution Provider program. As such, it is probably more of a value to Novell sales than to an individual technician. I remain somewhat baffled.
Wednesday, March 3, 2010
Fuzzy string matching in PostgreSQL
A recent project required me to use fuzzy string matching, or sound-alike matching, in an application that searched a list of names. It turns out there is a contrib module for the PostgreSQL database called fuzzystrmatch that provides several different matching algorithms.
The task at hand involved rewriting a legacy application, originally in PICK, in Ruby on Rails. The PICK application used a soundex search to find names of people that sounded like the search string.
Three algorithms are available as PostgreSQL functions (after installation of the fuzzystrmatch module). They are soundex(), levenshtein(), and metaphone().
Both soundex and metaphone convert a string into character codes. Soundex uses 4 characters and metaphone uses a configurable number of characters. Levenshtein directly compares two strings and returns the edit distance: the number of single-character changes needed to turn one string into the other, so a lower number means a closer match.
After some trial and error, I found that metaphone produced better results than soundex. I didn't test the Levenshtein function.
To improve the results, I added a classic substring search using ILIKE. The combination of ILIKE and metaphone gave me a broad, but reasonably accurate fuzzy string search.
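To give a feel for what soundex() produces, here is a simplified Python sketch of the classic algorithm. It treats H and W like vowels (the strict American Soundex rules handle them differently), so some edge cases may not match PostgreSQL's soundex() exactly:

```python
def soundex(name):
    """Simplified American Soundex: first letter plus up to three digits.
    H and W are treated like vowels here, a simplification."""
    # Letter -> digit groups from the classic algorithm.
    groups = {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
              "4": "L", "5": "MN", "6": "R"}
    code = {ch: d for d, letters in groups.items() for ch in letters}
    name = name.upper()
    digits = [code.get(ch, "") for ch in name]
    out = []
    prev = digits[0]  # the first letter's own digit is kept out of the code
    for d in digits[1:]:
        if d and d != prev:   # drop consecutive duplicates
            out.append(d)
        prev = d              # vowels (empty code) reset the duplicate check
    return (name[0] + "".join(out) + "000")[:4]
```

Names that sound alike collapse to the same code, which is why soundex-style matching casts such a wide net.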
Friday, February 26, 2010
Linux: fuser to find processes on TCP ports
Note: This is for Linux only. The Mac (BSD) version of fuser does not handle TCP/UDP ports.
Once or twice a year, I run into a problem where a process is using a TCP port and I need to find out which one. I am documenting it here for the next time so I don't have to look it up in man pages.
To see all processes, run fuser as root or with sudo.
To list all processes connected to TCP port 22:
fuser -n tcp 22
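Under the hood, fuser resolves a TCP port to a socket inode via /proc/net/tcp, then walks /proc/&lt;pid&gt;/fd looking for a descriptor that points at that inode. A rough Python sketch of the idea (Linux only; run as root to see other users' processes):

```python
import glob
import os

def parse_proc_net_tcp_line(line):
    """Parse one data line of /proc/net/tcp and return (local_port, inode).
    The local address is hex 'ADDR:PORT'; the socket inode is the
    tenth whitespace-separated column."""
    fields = line.split()
    port = int(fields[1].split(":")[1], 16)
    inode = int(fields[9])
    return port, inode

def find_pids_for_inode(inode):
    """Return pids whose /proc/<pid>/fd holds the given socket inode."""
    target = f"socket:[{inode}]"
    pids = []
    for fd in glob.glob("/proc/[0-9]*/fd/*"):
        try:
            if os.readlink(fd) == target:
                pids.append(int(fd.split("/")[2]))
        except OSError:
            continue  # process exited or permission denied
    return sorted(set(pids))
```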
Thursday, January 21, 2010
Rails 2.x scaffolding field types
Dude, where's my CRUD?
One of the powerful features of Rails 1.x was the ability to generate Create, Read, Update, and Delete (CRUD) admin screens automatically using the scaffolding script built into Rails.
The original scaffolding read the database schema and created basic but usable screens to let you add and edit database records. When the 2.x release of Rails came out, scaffolding lost that power. Now, you have to manually specify each table field and type on the command line when running scaffold. If you don't, the generated screens will be empty.
Worse, a basic reference to all valid field types was missing. Here are all the valid types I have been able to dig up:
string
text (long text, up to 64k, often used for text areas)
datetime
date
integer
binary
boolean
float
decimal (for financial data)
time
timestamp
A mapping of the scaffolding types to data types in corresponding databases can be found on Overooped.
Here is an example of using 2.x scaffolding with data types, run from the Rails application root directory:
ruby script/generate scaffold Modelname name:string title:string employed_on:date remarks:text
Here is an example of using Rails 3.x scaffolding with data types, run from the Rails application root directory:
ruby script/rails generate scaffold Modelname name:string title:string employed_on:date remarks:text