Sunday, November 16, 2008

Unix sort on multiple columns

=============== UPDATED: THE ORIGINAL COMMANDS HERE WERE NOT QUITE RIGHT. SEE ALSO THE UPDATED BLOG POST IN JANUARY 2010 ===============

To sort on multiple columns using the unix sort command, use the -k option multiple times, giving each key a start and an end column. (A bare -k 2 means "from column 2 to the end of the line", not just column 2.)

I.e., to sort first on column 2, then on column 1, use:

sort -k 2,2 -k 1,1 inFile > outFile

To sort the columns in numerical order (instead of string order), put an "n" after the column numbers, i.e.

sort -k 2,2n -k 1,1n inFile > outFile
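
Here is a small runnable sketch of a two-key numeric sort. Each key is written with an explicit end column (-k 2,2n rather than -k 2n), because a key without an end column runs to the end of the line. The sample data and the temporary file are made up for illustration.

```shell
# Hypothetical sample data: two space-separated numeric columns.
inFile=$(mktemp)
printf '%s\n' '10 2' '9 2' '3 1' '20 1' > "$inFile"

# Sort numerically on column 2 first, then on column 1 to break ties.
sorted=$(sort -k 2,2n -k 1,1n "$inFile")
printf '%s\n' "$sorted"

rm -f "$inFile"
```

With string order instead of numeric order, "10 2" would sort before "9 2", which is why the "n" matters here.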

Friday, November 14, 2008

Area Under Curve with Open Office / Excel

If you have the curve points, here's how to find the area under the curve in Open Office or Excel



(Taken from http://people.stfx.ca/bliengme/ExcelTips/AreaUnderCurve.htm )

Wednesday, September 24, 2008

Latex Math Symbols

For some reason nobody ever seems to mention this. But you actually need to include:

\usepackage{amssymb}

in the preamble of your LaTeX document (before \begin{document}) to use a lot of the math symbols. (Like \blacksquare )

Otherwise you'll end up getting an "Undefined control sequence" error.

Wednesday, September 10, 2008

Calculate Pearson Correlation Coefficient Critical Values

Two-tailed Pearson Correlation Coefficient Critical Values can be found exactly in Open Office or Excel by:

1. Enter the probability you wish into a cell, e.g. 0.1 or 0.05 (say cell A1)
2. Enter the number of samples (N) into a cell (say cell A2)
3. Enter =A2-2 into a cell to get the degrees of freedom (say cell A3)
4. Enter =TINV(A1;A3) into cell A4 to get the two-tailed t critical value (Excel typically wants a comma instead: =TINV(A1,A3))
5. Enter =SQRT(A4^2/(A4^2+A3)) into a cell to find the Critical Value.

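This takes advantage of the relationship between the t-test critical value and the Pearson critical value:

t = r / sqrt( (1 - r^2) / (n - 2) )

where t is the t-test value, r is the Pearson correlation coefficient, and n is the number of samples. Solving for r gives r = sqrt( t^2 / (t^2 + n - 2) ), which is the formula in step 5.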
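
As a quick sanity check of the last step, here is a sketch that does the same arithmetic with awk. It assumes N = 10 and a two-tailed probability of 0.05, and it takes t = 2.306 (the two-tailed t critical value for 8 degrees of freedom, rounded to three decimals) as given instead of computing it, since awk has no TINV equivalent.

```shell
# Assumed inputs: N = 10 samples, two-tailed probability 0.05.
# t = 2.306 is the two-tailed t critical value for 8 degrees of freedom,
# i.e. what =TINV(0.05;8) gives, rounded to three decimals.
n=10
t=2.306

# r = sqrt(t^2 / (t^2 + df)), with df = n - 2
r=$(awk -v t="$t" -v n="$n" 'BEGIN { df = n - 2; printf "%.3f", sqrt(t * t / (t * t + df)) }')
echo "critical r = $r"
```

This prints 0.632, which matches the usual table value for 8 degrees of freedom at the 0.05 level.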

Monday, September 8, 2008

Changing ownership for a mounted drive

Hmmm. Quick fix I found... after mounting my newly formatted drive, I suddenly did not have permission to write to it or execute anything on it (everything on it was owned by root).

Changed ownership and got permissions by running:

sudo chown -R km /media/disk

Dunno if I'll have to find some other solution so I don't have to do this every time though.

Sunday, September 7, 2008

Unmounting a busy drive in Unix

When a drive is "busy" and having problems unmounting

sudo umount /media/HD-HSU2/
umount: /media/HD-HSU2: device is busy
umount: /media/HD-HSU2: device is busy

you can find out what's keeping the drive busy by using:

fuser -m /dev/sdc1

This will return the process ID (or IDs) of whatever is keeping the device busy.

Then use ps (e.g. ps -p followed by that process ID) to find out what process it really is, and kill it.

Tuesday, June 10, 2008

tcpdump for capturing packets and headers

Lots of programs and methods are able to connect to the internet and download webpages. Unfortunately, when you try to do this with your own programs, a ton of stupid webpages have checks that will stop you. They only want their "real users" using regular browsers to download them.

So, I used tcpdump to reverse engineer webpages and figure out exactly how to imitate / pretend to be a regular user.

First, open your browser and clear all cookies, cache, etc. etc. Also, it helps to turn off automatic image loading (Edit -> Preferences -> Content , uncheck "Load images automatically" in Firefox) Then, from the command line, type:

sudo tcpdump -s 6000 -i eth1 -A -w outputFile

Now, visit the webpage with your browser, and do whatever you want your program to do (login, load certain pages, etc. etc.)

tcpdump will be listening and capturing exactly how your browser and this webpage interacted.

Ctrl+C the tcpdump when you're done. Then type:

tcpdump -A -r outputFile > outputFile2

Now you can vi outputFile2 and view how your browser and the webpage interacted.

Your program may need to imitate such things as the User-Agent, Referer (the HTTP header really is spelled that way), etc. Look out for mysterious redirects used to fool programs, and make sure to capture all cookies set.

Saturday, June 7, 2008

Shell script for loops

To do a list of commands over and over for each subdirectory, type:

sh
for i in ./*/
do
cd "$i" || continue
(list of commands like mv ../a . or perl blog.pl *blog*.html etc.)
cd ..
done
exit
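
A runnable sketch of the same idea, in a form that also copes with directory names containing spaces. The scratch directory and file names are hypothetical. It globs the subdirectories directly instead of parsing ls, and does the cd inside a command substitution (which runs in a subshell), so no "cd .." is needed afterwards.

```shell
# Hypothetical layout: a scratch directory with two subdirectories.
top=$(mktemp -d)
mkdir "$top/alpha" "$top/beta dir"
touch "$top/alpha/a.html" "$top/beta dir/b.html" "$top/beta dir/c.html"

report=""
for d in "$top"/*/
do
    name=$(basename "$d")
    # $( ... ) runs in a subshell, so this cd does not affect the loop itself.
    count=$(cd "$d" && ls | wc -l | tr -d ' ')
    report="${report}${name}=${count};"
done
echo "$report"

rm -rf "$top"
```

Here the "list of commands" is just counting the files in each subdirectory. Anything else can go in its place.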

List only directories in Unix

Much easier way to list only the directories:

ls -d ./*/

Friday, June 6, 2008

Parenthesized search and replace in vi

Parenthesized portions of a regular expression match can be referred to later!

i.e. :%s/\(match1\).*\(match2\)/\1 \2/

\1 will be replaced with whatever match1 held, and \2 will be replaced with whatever match2 held.

e.g.

:%s/.* .* \(.*\)/\1/

will remove the first two columns just like in the post below. (.* is greedy, so on lines with more than three columns this keeps only the last column.)
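
The same \( \) and \1 syntax works in sed's s command, which makes it easy to try outside the editor. A small sketch on one made-up line follows. It uses [^ ]* (a run of non-space characters) instead of .* so that each piece matches exactly one column.

```shell
line='2395 4586 3496'

# Keep only what the parenthesized group matched (everything after column 2).
third=$(printf '%s\n' "$line" | sed 's/[^ ]* [^ ]* \(.*\)/\1/')

# Swap the first two columns by referring to two groups.
swapped=$(printf '%s\n' "$line" | sed 's/\([^ ]*\) \([^ ]*\)/\2 \1/')

printf '%s\n' "$third" "$swapped"
```

The first command leaves 3496, and the second leaves 4586 2395 3496.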

vi search and replace with regular expressions

in vi, to search and replace:

:%s/search_phrase/replace_phrase/

with regular expressions:

.* matches zero or more characters (any characters)

so to get rid of the first two columns, separated by spaces

ie

2395 4586 3496
146434 6 3945

:%s/.* .* //

so it becomes:

3496
3945

or you could also do
:%s/[0-9]* [0-9]* //

Thursday, June 5, 2008

Reformatting drives in unix

run:

sudo gparted

(sudo apt-get install gparted if it's not already installed in ubuntu)

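FAT32 has been working pretty well for R/W access in both windows and unix. Though it has a 4GB filesize limit (strictly, 4 GiB minus 1 byte), and a limit of 65,534 directory entries per folder. Long file names use up several entries each, so in practice that is more like ~20,000 to 50,000 files per folder.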

Wednesday, June 4, 2008

Execute shell commands from java

(Probably only works on unix)

To just execute without worrying about returned output:

Runtime.getRuntime().exec(new String[] { "sh", "-c", "mv a b" });

(This example would run "mv a b" in the shell)

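To execute and retrieve the returned output:

StringBuilder output = new StringBuilder();
try {
    Process p = Runtime.getRuntime().exec(new String[] { "sh", "-c", "ls *.tmp" });
    java.io.BufferedReader reader = new java.io.BufferedReader(
            new java.io.InputStreamReader(p.getInputStream()));
    String line;
    // Read before waitFor(), so a full pipe buffer cannot block the child process.
    while ((line = reader.readLine()) != null) {
        output.append(line).append('\n');
    }
    reader.close();
    p.waitFor();
} catch (java.io.IOException | InterruptedException e) {
    e.printStackTrace();
}

(output will hold the returned output of running "ls *.tmp" in the shell)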

Monday, June 2, 2008

What line number you're on in vi

To find out what line you're currently at in a file in vi, type

Ctrl + g

This will also give you some other helpful info, like what % of the way through the file you're at and what column you're on.

vi commands - insert at beginning or end of every line

:%s/^/hello/
inserts "hello" at the beginning of every line

:%s/$/world/
inserts "world" at the end of every line

Listing directory sizes in Unix

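du -sk prints one total (in kilobytes) for each argument, the * expands to every file and directory in the current directory, and sort -n puts the biggest ones at the bottom.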

du -sk * | sort -n

Saturday, May 17, 2008

Books to Check Out

Reminder list to myself of books or movies I want to check out in the future.

Books:

"DisneyWar" by James B. Stewart

Round a double to the nearest tenth

Rounding a double to the nearest tenth in java (easy enough to change to nearest hundredth, thousandth, etc.)


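// Scale up, round to the nearest whole number, scale back down.
// Use 100D for the nearest hundredth, 1000D for the nearest thousandth, etc.
// (No int cast, so large values do not overflow.)
double finalNum = Math.round(originalDouble * 10D) / 10D;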

.tar.gz

Create a .tar.gz file

     for a file
     tar -czvf name_of_your_archive.tar.gz /path/to/file

     or for a directory
     tar -pczvf name_of_your_archive.tar.gz /path/to/directory


Unpack a .tar.gz file

     tar -xzvf name_of_archive.tar.gz
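
A round-trip sketch in a scratch directory, to see both commands working together. The directory and file names are made up. -C tells tar to change into a directory first, which keeps absolute paths out of the archive, and -t lists an archive's contents without unpacking it.

```shell
# Hypothetical scratch area with one small directory to archive.
work=$(mktemp -d)
mkdir "$work/project"
printf 'hello\n' > "$work/project/notes.txt"

# Create the archive; -C makes the stored path the relative "project/...".
tar -czf "$work/project.tar.gz" -C "$work" project

# List the contents without unpacking.
listing=$(tar -tzf "$work/project.tar.gz")

# Unpack into a separate directory and read the file back.
mkdir "$work/unpacked"
tar -xzf "$work/project.tar.gz" -C "$work/unpacked"
restored=$(cat "$work/unpacked/project/notes.txt")

printf '%s\n' "$listing" "$restored"
rm -rf "$work"
```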

Java MaxHeap Class

This was taken and modified from http://www.cs.usfca.edu/galles/cs245/lecture/MinHeap.java.html

I should change this to be more generic sometime.

Right now it's assuming I'm adding Objects of class Link.

Link implements Comparable, and has a constructor where you pass it one String ("max" or "min"). If "max", you assign it the maximum possible value. If "min", you assign it the minimum (for use in MinHeap if you want to create one)


public class MaxHeap {
    private Link[] Heap;   // 1-based: elements live in Heap[1..size]
    private int maxsize;
    private int size;

    public MaxHeap() { this(100); }

    public MaxHeap(int max) {
        maxsize = max;
        Heap = new Link[maxsize];
        size = 0;
        Heap[0] = new Link("max");   // sentinel: nothing compares greater than it
    }

    private int leftchild(int pos) { return 2*pos; }

    private int rightchild(int pos) { return 2*pos + 1; }

    private int parent(int pos) { return pos / 2; }

    private boolean isleaf(int pos) { return ((pos > size/2) && (pos <= size)); }

    private void swap(int pos1, int pos2) {
        Link tmp;

        tmp = Heap[pos1];
        Heap[pos1] = Heap[pos2];
        Heap[pos2] = tmp;
    }

    public void insert(Link elem) {
        size++;
        if (size >= maxsize-1)
            doubleSize();
        Heap[size] = elem;
        int current = size;

        // Sift up; the sentinel in Heap[0] stops the loop at the root.
        while (Heap[current].compareTo(Heap[parent(current)]) > 0) {
            swap(current, parent(current));
            current = parent(current);
        }
    }

    public void doubleSize() {
        maxsize = maxsize * 2;
        Link[] newHeap = new Link[maxsize];

        for (int i = 0; i <= size; i++)
            newHeap[i] = Heap[i];

        Heap = newHeap;
    }

    public void print() {
        int i;
        for (i = 1; i <= size; i++)
            System.out.print(Heap[i] + " ");
        System.out.println();
    }

    public Link removeMax() {
        // Without this check, an empty heap would swap the sentinel out of Heap[0].
        if (size == 0)
            throw new java.util.NoSuchElementException("heap is empty");
        swap(1, size);
        size--;
        if (size != 0)
            pushdown(1);
        return Heap[size+1];
    }

    private void pushdown(int position) {
        int largestchild;
        while (!isleaf(position)) {
            largestchild = leftchild(position);
            // Pick the right child instead if it exists and is at least as large.
            if ((largestchild < size) && (Heap[largestchild].compareTo(Heap[largestchild+1]) <= 0))
                largestchild = largestchild + 1;
            // Already at least as large as both children: nothing left to do.
            if (Heap[position].compareTo(Heap[largestchild]) >= 0) return;
            swap(position, largestchild);
            position = largestchild;
        }
    }

    public int size() { return size; }
}