Saturday, August 29, 2009

Format Multiple Cells in Calc

I was just creating a spreadsheet in OOCalc that required many of the cells to be split in two, and the only way I know how to do so is by merging the other cells so they each take up the same amount of space as two cells. Confused? Sorry :( Anyway, to keep my mouse work to a minimum, I used the "Format Paintbrush" tool to easily merge cells together.

Unfortunately, I had tons of these cells that needed to be merged, and the job would have taken too long because the "Format Paintbrush" tool loses its memory after one use. Basically, you click/highlight the cell/text you want to copy the format of (whether it's bold, italic, a merged cell, etc.), and then you click on the cell/text that you want to receive the format. So I did a quick online search to see if there was any way I could format multiple items at once instead of having to click the original item, click "Format Paintbrush", and then click on the end item, over and over. Using the words "openoffice calc format paintbrush", I got thousands of hits, but thankfully the first one brought me to a helpful webpage over at Solveig Haugland's OO site that addressed the very issue I was having.

In short, it seems that all you have to do to retain a certain format in the "Format Paintbrush" for use on multiple items is to double-click the button. At that point, you can click, drag, and highlight as many things as you want to have adopt your desired formatting. To exit that mode, all you have to do is click the button a final time. I hope this saves you as much time as it did me.

Thursday, July 9, 2009

Long Time, No See...And OCR software

So it's been a really long time since my last post...oops ;p Sorry? There's no single explanation to excuse my absence, nor do I care to waste space to detail them at this time, so I'll just get right on with the main topic of this entry: OCR software.

What is OCR? OCR stands for Optical Character Recognition, and what OCR programs do is basically read an image file (like a JPG, PNG, or TIFF) and output any text they recognize. As you can imagine, this greatly reduces the amount of manual labor needed to turn a purely binary file into a text file. And as Linux/BSD enthusiasts know, text files are incredibly easy to slice, dice, maneuver, and just generally utilize for various purposes. For example, even a general computer user could search for instances of a specific term in a text file; such a task, which could ordinarily take minutes, if not hours or days, to complete by eye, may take only a few seconds once an image has been processed by an OCR program. With the aid of text-processing tools and languages such as sed, awk, or Perl, the possibilities are almost limitless as to what can be accomplished with a typical text file.
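To make that concrete, here's a minimal sketch of the kind of slicing and dicing I mean. The sample text and the helper names (count_matches, replace_word) are made up for illustration; in practice you'd point these at whatever file your OCR program produced.

```shell
# Create a small stand-in for OCR output (hypothetical sample text).
ocr_out=$(mktemp)
printf '%s\n' 'Invoice 42' 'Total due: 17 USD' 'invoice copy' > "$ocr_out"

# Count the lines that mention a term, ignoring case.
count_matches() { grep -ci "$1" "$2"; }

# Replace one plain word with another everywhere and print the result.
# (Only meant for simple words; special regex characters are not escaped.)
replace_word() { sed "s/$1/$2/g" "$3"; }

count_matches invoice "$ocr_out"        # lines containing "invoice"
replace_word USD EUR "$ocr_out"         # same text with USD swapped for EUR
awk '{ print NR ": " $1 }' "$ocr_out"   # line number plus first word of each line
```

None of this is possible while the text is still locked inside an image, which is the whole point of running OCR first.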

So having established the great advantage(s) text files have over image files, where can you get started? Well, I'm glad you asked! Just for your convenience, I've (lightly) tried out three different OCR programs that are available in Ubuntu's repositories: gocr, ocrad, and (the self-proclaimed "commercial quality") tesseract. These or others should be available in all or most other major distributions' repositories as well. The test image used incorporates various font effects and sizes, making it a good candidate to compare the capabilities of the 'wares. So come on, let's see how they all fare.

The first program I tried was gocr, which can be installed on Ubuntu with the command

    sudo apt-get install gocr

Right out of the box, gocr claims it only supports PBM, PNM, PPM, and PCX image files; however, if you install the package "netpbm", it'll also handle pnm.gz, pnm.bz2, JPG, JPEG (what's the diff?), TIFF, GIF, BMP, PS (single pages only), and EPS files. So unless you have P*M or PCX files lying around your filesystem, you'll probably want to go ahead and install the "netpbm" package (simply replace "gocr" with "netpbm" in the above command). Technically, you can use GIMP to save your image as any of those filetypes, though, so that's an alternative.

At this point, you'll need to have a file of a supported type, and all you have to do to extract the text of a given image file is issue the command

    gocr -i path/to/image/file -o path/to/desired/output/file

Example:

    gocr -i test.png -o text

Keep in mind that Unix/Linux programs don't typically rely on file extensions, which is why the output file in the example has none; if you must, you can name it "text.txt". So what this command does is take the input image file "test.png", have gocr process it, and then write the result to the file "text". Simple enough, yeah? Unfortunately, gocr isn't the most accurate ever, and I had to hand-edit several places in the newly created text file. By and large, however, the output was the same as the original (which is good!).

The next specimen, ocrad, seems to be pickier about its file formats, and you must provide a PBM, PGM, PNM, or PPM file (GIMP may come in handy here). The command used for ocrad was

    ocrad path/to/file -o path/to/desired/output/file

Example:

    ocrad test.ppm -vo text -x results

In this example, I've also used the -v option (for verbose mode), which can legitimately be combined with the -o option to make -vo, and the -x option (to specify a file in which to place OCR results, which is completely optional). The result was very similar to gocr's, although the errors were slightly different.

Finally, we get to tesseract, which was apparently developed by HP between 1985 and 1995. The actual package name is "tesseract-ocr", so that'll need to be used to install it. My experience with this contender started out a bit flaky. I tried to use it with a command of the form

    tesseract test.png text

However, I ran into an error:

    unable to load unicharset file /usr/share/tesseract-ocr/tessdata/eng.unicharset

Being the curious type that I can be, I ventured into the stated directory, and I immediately found the reason why the file couldn't be loaded: it didn't exist! Another file with a similar name did, though ("deu.unicharset"). Rather than research the issue properly, however, I simply made a symbolic link named "eng.unicharset" to the existing "deu.unicharset" file. That apparently did the trick for that specific error, but when I tried the command again, I was greeted with yet another error that was very similar to the previous one. Since I was feeling kind of lazy (or smart, depending on your view), I just wrote a simple one-liner to automate the task of creating links for the other files tesseract was complaining about:

    for i in deu.*; do sudo ln -s "$i" "$(echo "$i" | sed 's;deu;eng;')"; done

This was executed from within the /usr/share/tesseract-ocr/tessdata directory, of course. Back to the original command I went, which without fail produced yet another error; this time I was informed for the first time that the input file must be either a TIFF or MDI file. At last, I converted the image file into the TIFF format with GIMP, and success: an output file was created. True to its word, the text output was indeed the best of the bunch in terms of accuracy. There were far fewer mistakes, most of which were made with the smaller text. On the other hand, white space was completely disregarded, so the text was all squished together vertically. Still, it was an improvement overall, and I think I'd recommend tesseract out of all of them.

Perhaps there's an even better Linux OCR alternative that's not in Ubuntu's repositories, but for now, tesseract will do for me and the general user.
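For anyone who wants to try the same linking trick, here's a slightly more careful sketch of that loop. It runs in a scratch directory with empty stand-in files (the name deu.inttemp is just an example), so nothing under /usr/share is touched; the function name link_lang is my own invention. Bear in mind that pointing "eng" at the German data is a workaround, and if your distribution offers a separate English data package (on Ubuntu it is probably called tesseract-ocr-eng, though I haven't verified that here), installing it is likely the cleaner fix.

```shell
# Work in a scratch directory instead of the real tessdata directory.
demo_dir=$(mktemp -d)
touch "$demo_dir/deu.unicharset" "$demo_dir/deu.inttemp"

link_lang() {
    # $1 = directory, $2 = existing language prefix, $3 = prefix to create
    for f in "$1/$2".*; do
        [ -e "$f" ] || continue          # skip if the glob matched nothing
        ext=${f#"$1/$2."}                # everything after "prefix."
        ln -s "$2.$ext" "$1/$3.$ext"     # relative link within the same directory
    done
}

link_lang "$demo_dir" deu eng
ls -l "$demo_dir"
```

To use it for real, you'd call link_lang with the tessdata path and run it with enough privileges to write there.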

Friday, March 28, 2008

OpenOffice.org — Colorful Conditional Formatting in Calc

Sometimes it seems like the best way to highlight data in a spreadsheet is to change its font and/or background color. For instance, the constantly-evolving grade sheet I use to track my school grades has specific cells that tell me whether I'm ahead of, on time, or behind the recommended schedule for a particular class (and by how many weeks I'm ahead or behind). Being ahead is good, so I've chosen green to represent the cell in that instance, being on time is okay, so I've chosen yellow for that, and being behind is horribly bad, so I use red for that. Technically, this color concept came from Excel 2007, and I merely continued using it as I switched to Calc. It's a good thing that the feature was able to translate smoothly from Excel to Calc!
Still, as is common with OpenOffice, this feature is much easier to use in Excel, while still possible in Calc. For the most part, you can rely on Calc to do everything in the translation for you. However, if you don't have an Excel file that you can open in Calc, I'll go ahead and share the code with you.
First, though, let me explain the conditional formatting as it applies to my grade sheet. Three fields are involved: “Week Submitted”, “Week Due”, and “Progress”. “Week Submitted” is simply the week number that I submitted a given assignment—pretty self-explanatory. “Week Due” is the week a particular assignment is due, as predefined by the class's teacher (this is usually a recommendation rather than a requirement, though). Finally, “Progress” is the field with both the formula used to determine whether I'm ahead, on time, or behind, as well as the conditional formatting applied to it.

Creating Styles

Before we get to the actual conditional formatting, let's create three styles—one for each condition. If you need to add colors to OpenOffice's palette, refer to my previous post entitled OpenOffice.org — Adding Colors to OOo's Initial Selection.* Go to “Format” > “Styles and Formatting”. A new window opens up with a few buttons, a drop-down box at the bottom, and a list in the middle. Make sure that “Cell Styles”—the upper-leftmost button in the window—is checked. Right-click the word “Default”, which is located right below the “Cell Styles” button, and select “New”. In the “Font Effects” tab, choose the color you'd like the font to have, and on the “Background” tab, choose the color you'd like the cell to have. If I'm ahead of my class's recommended schedule, both the font and background of the cell will be green; specifically, my font is RGB=0,97,0, and my background is RGB=198,239,206. You may have added those colors previously.
On the “Organizer” tab, give this style a descriptive name; I named it “MS Office Good Green” since the colors can be used to visually indicate anything positive, whether it's when you're ahead of schedule or earning profit. You can repeat this two more times (starting with right-clicking “Default”), pairing the font and background colors I've provided at the bottom of this post, or you can use your own color combinations. Just be sure that one of your two new combinations indicates neutrality and the other negativity.

Applying Conditional Formatting

Now, all you have to do is select the cells that you'd like to apply conditional formatting to, click “Format”, and then click “Conditional Formatting”. Here, you can specify up to three separate conditions that can each have their own unique formatting. Thankfully, I only have three separate conditions: “ahead by x weeks”, “on time”, and “behind by x weeks”. I can see this limitation posing a problem if you need more than three, obviously.
You'll notice that there are drop-down boxes that display either the text “Formula is” or “Cell value is”. Unfortunately, Calc doesn't have a straightforward “Cell value contains” or equivalent, as Excel does, which would allow us to simply put in “ahead”, “on time” and “behind” as the cell values. You could technically do “Cell value is equal to ahead”, but then it wouldn't apply the proper formatting because that condition would never be met; I'm never just “ahead”, I'm “ahead by x weeks”, where x can have a value from 1 to infinity, and the s may or may not be there, depending on normal English usage (i.e., 1 week, 2 weeks, etc.). In translating the original Excel file, Calc interpreted “Cell value contains” as “Formula is”, so that's what we'll have to use. In the textbox to the right of “Formula is”, you need to put in the following code:
NOT(ISERROR(SEARCH("on time";I3)))
This code searches for the text “on time” in cell I3, and it will apply the formatting that we'll specify next if it finds “on time”. Remember that I use yellow to visually indicate that I'm on schedule, so we'll define the formatting so that the background of the cell is yellow and the text itself is brownish. Again, these colors are actually taken from Excel, and I've decided to continue with them even as the spreadsheet is now in ODS format.
The drop-down box to the right of “Cell Style” determines the formatting. When Calc interpreted the original file, it added the correct styles simultaneously. You may need to define the styles you want to apply, however. Here is where you use those styles that you previously created. Click the down-arrow and select the style that you'd like to apply for this first condition. Then, repeat these steps for the other two conditions. If all goes well, you should be finished!
To test, type “on time”, “behind”, and “ahead”, one by one, into the conditionally formatted cells. If the cell background and font colors change to the correct ones, you did it! If not, please reread my instructions to see if you missed anything.
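If the “contains” logic still feels fuzzy, here's a rough sketch of the same idea outside of Calc. It's shell rather than spreadsheet code, the function name style_for and the labels good/neutral/bad are placeholders I made up, and it assumes the three conditions are all of the NOT(ISERROR(SEARCH("...";I3))) form shown above, with SEARCH ignoring case.

```shell
# Return a style label for a Progress cell's text, using a
# case-insensitive "contains" test for each of the three conditions.
style_for() {
    text=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case $text in
        *"on time"*) echo neutral ;;   # yellow background, brown text
        *ahead*)     echo good ;;      # green
        *behind*)    echo bad ;;       # red
        *)           echo none ;;      # no condition matched, default style
    esac
}

style_for "ahead by 2 weeks"
style_for "On time"
style_for "behind by 1 week"
```

The point is simply that “ahead by 2 weeks” and “ahead by 1 week” both match the “ahead” condition, which a plain “Cell value is equal to” test could never do.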
As usual, if you use the “Conditional Formatting” feature of OpenOffice fairly often, you can add it to the “Standard” toolbar if you:
  1. Click the down-arrow at the far-right of the “Standard” toolbar (the one with the options “New”, “Open”, “Save”, and others).
  2. Click “Customize Toolbar”.
  3. Ensure “Standard” is selected in the drop-down box to the right of the word “Toolbar”.
  4. Select “Add” on the right.
  5. Find the command you wish to place on the Standard toolbar. In this case, you'd go to “Format” under “Category” and “Conditional Formatting” under “Commands”—the same way you got to Conditional Formatting in the first place. Press “Close”.
  6. At this point, you can drag the newly-added command to where you'd like it, or you can use the up and down arrows to the right of the scrollbox. When you are finished, press “OK”.
Well, I hope you found this post helpful. Until next time :)
*Add the following colors to OOo's palette if you want to use the same ones I do. The format is RGB=Red value, Green value, Blue value (color description). You can enter these values as instructed in my post OpenOffice.org — Adding Colors to OOo's Initial Selection.
  “ahead”: RGB=198,239,206 (light green background), RGB=0,97,0 (dark green text)
  “on time”: RGB=255,235,156 (light yellow background), RGB=156,101,0 (brown text)
  “behind”: RGB=255,199,206 (light red background), RGB=156,0,6 (dark red text)

Friday, March 21, 2008

OpenOffice.org — Full-Page Backgrounds in Writer

I personally have found OpenOffice to be a very competent free and open-source alternative to Microsoft Office. I know that it doesn't have the latest features that MS Office does, necessarily, but it certainly has everything that I'd imagine most people need for basic (and not-so-basic) word processing. Of course, it may have the features you want, but it doesn't always have an intuitive way to use them. For instance, in MS Word 2007, you can easily change the color of a document's background by selecting (or mouse-scrolling to) the “Page Layout” tab and then the “Page Color” button in the “Page Background” group (that may all sound complicated, but it's actually not very). In OpenOffice Writer, however, it's not nearly as simple (or obvious). After much searching, I found a work-around that can achieve the same effect:
  1. Go to Format > Page > the Page tab and change all of the margins to 0”.
  2. Go to the “Background” tab and choose the color you want as the page's background.
  3. Go to the “Borders” tab, turn the borders on for all sides by clicking the second-leftmost button under “Line Arrangement” and “Default” on the left.
  4. Select the border color.
  5. Finally, change the “Spacing to contents” values to the appropriate amount of spacing you want. They take the place of the margins, so if you want the text to be 1 inch from the border, type 1” in each of the textboxes. If “Synchronize” is checked, the value for all four textboxes will change at once, so you only need to enter the value 1” in one of the textboxes.
  6. Click “OK”.
Voilà! Your page is beautiful!
If you use full-page backgrounds in Writer fairly often, you can add “Page Settings” (or pretty much any other feature) to the “Standard” toolbar by adhering to the following instructions:

  1. Click the down-arrow at the far-right of the “Standard” toolbar (the one with the options “New”, “Open”, “Save”, and others).
  2. Click “Customize Toolbar”.
  3. Ensure “Standard” is selected in the drop-down box to the right of the word “Toolbar”.
  4. Select “Add” on the right.
  5. Find the command you wish to place on the Standard toolbar. In this case, you'd go to “Format” under “Category” and “Page Settings” under “Commands”. Press “Close”.
  6. At this point, you can drag the newly-added command to where you'd like it, or you can use the up and down arrows to the right of the scrollbox. When you are finished, press “OK”.

source: <http://homepage.ntlworld.com/pesala/Home/html/watermarks.html#Fills>

Friday, March 14, 2008

OpenOffice.org — Adding Colors to OOo's Initial Selection

When I first started using OpenOffice, I had wanted to change the background of some cells in Calc, but I didn't know how to add my own custom colors to use instead of the standard set of colors that came preinstalled with OpenOffice. Well, if you're in a similar situation, I have good news for you! Yes, you can in fact add your own custom colors to use in OpenOffice :) No longer will you be limited to the existing palette!—all you have to do is:
  1. In any OpenOffice application, go to “Tools” > “Options” > “OpenOffice.org” > “Colors”.
  2. Click “Add” and choose a name for the new color. Note: if a color is selected before you press “Add”, you'll get a message stating that the name already exists. Press “OK” and at the prompt, type in the name of the new color you're going to create. Now you're back on track.
  3. Create the color by changing the Red, Green, and Blue (RGB) or Cyan, Magenta, Yellow, and Key (CMYK) values (you can switch between RGB and CMYK in the drop-down box to the left of “Delete”). Alternatively, for better fine-tuning (including Hue, Saturation, and Brightness settings), you can click “Edit”. Note: if you change the color values in the Edit window, you'll have to apply the changes by pressing “Modify”. Unfortunately, OpenOffice doesn't accept hexadecimal values.
  4. When you're done perfecting your color, simply press “OK”. Caution: if you change an existing color's values and then click “Modify”, the new values will overwrite the old color! Only click “Modify” if you truly want to commit the changes.
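Since the dialog wants decimal Red, Green, and Blue values, a color you only know as a hex code (from a web page, say) has to be converted first. Here's a small sketch that does the conversion in a terminal; the function name hex_to_rgb is my own, and it assumes a plain six-digit code with an optional leading "#".

```shell
# Convert a six-digit hex color (with or without a leading "#") to "R,G,B".
hex_to_rgb() {
    hex=${1#\#}                               # drop a leading "#", if any
    r=$(printf '%s' "$hex" | cut -c1-2)       # first two hex digits
    g=$(printf '%s' "$hex" | cut -c3-4)       # middle two
    b=$(printf '%s' "$hex" | cut -c5-6)       # last two
    printf '%d,%d,%d\n' "0x$r" "0x$g" "0x$b"  # print each pair as decimal
}

hex_to_rgb '#C6EFCE'   # prints 198,239,206
hex_to_rgb 006100      # prints 0,97,0
```

You can then type the three numbers into the Red, Green, and Blue boxes in step 3.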
Of course, it would be really nice (hint, hint) if OpenOffice allowed you to add custom colors at the same time that you choose the font or background colors, as Microsoft's Office does. Maybe in a future update . . . .

Friday, March 7, 2008

OpenOffice.org — Creating and Setting a Default Template

If you find yourself frequently using the same or a similar layout for new files in OpenOffice Writer, you may find it easier to create a new default template that automatically incorporates the changes you usually make by hand. This is a simple way to save time, especially with the more sophisticated layouts you may use.
To change the default template for text documents in Writer, first create a blank document with all of the formatting that you'd like the template to have. Next, go to File > Templates > Save, make sure "My Templates" is selected under Templates > Categories, and then give the template a name under "New template" and press OK. This saves your template in the default template folder, which you can specify in Tools > Options > OpenOffice.org > Paths. To make this template the default template, go to File > Templates > Organize, double-click "My Templates", right-click the template you just saved, and finally click "Set As Default Template". Click "Close", and you're done! Every time you open Writer or choose "New" from the toolbar, the new document will automatically use your template.
To reset the default template, go to File > Templates > Organize, double-click "My Templates", right-click the template that's currently the default, hover over "Reset Default Template", and choose "Text Document".
If you'd rather choose which template to use each time you create a new document, simply go to File > New > Templates and Documents, or add a button to the Standard toolbar that does the same thing. To do so, click the down arrow on the far right of the Standard toolbar, click “Customize Toolbar”, and under Toolbar Content > Commands, check “New Document from Template”. When you click on this button, you will get to choose which template to use for the new document you create. Keep in mind that all template files (*.ott, etc.) should be closed while you make these changes.
Also take note that with the default “New” button, if you make a selection from its down arrow, whichever selection you make will become the new default selection whenever you click the actual “New” button. For example, if I click the down arrow to the right of “New” and select “Spreadsheet”, the “New” button will now show the spreadsheet icon to demonstrate that if I press “New”, a new spreadsheet will be created (rather than a text document, for instance) since I last selected “Spreadsheet” from the down arrow. This information is current, as of OpenOffice.org version 2.3.1 on Windows Vista. To find out which version you're using, go to Help > About OpenOffice.org. Some steps may be different in OpenOffice.org for other operating systems, such as Linux.

Friday, February 29, 2008

Wine

Expanding on the Wine talk, why don't we discuss why Wine is useful? Well, I personally know that it's making my transition much easier. There are tons of free and open-source programs available for Linux, and in this case, that may be working both for and against me. Given the vast amount of Linux software, I'm bound to find a program that's suitably equivalent to the programs I formerly used on Windows; but at the same time, it may take me a while to finally find that program. Until I find these alternatives, Wine lets me keep using my Windows programs, either indefinitely or until I come across a Linux-based replacement. Currently, that's what I'm doing with a few programs. As I mentioned before, I use both HashX and (to a lesser extent) Easy Duplicate Finder. In addition, I use Bulk Rename Utility, which, as the name suggests, allows me to easily rename several files at once. I understand that there's Métamorphose, but I haven't tried that yet. Now, I don't necessarily use these just because I'm used to them or haven't found a Linux equivalent; in some cases, the Linux equivalents are inferior to their Windows counterparts. For example, I feel that the free Paint.NET is superior to GIMP, even if GIMP has many more features; mostly it's because Paint.NET has an easier-to-use interface. It has one window, with four others (toolbox, color editor, history, and layer manager) inside it, vs. GIMP's multi-window configuration. But also, tools in Paint.NET are more intuitive, and thus easier, to use. You select a tool from the toolbox (say, the selection tool) and the more refined settings for that tool appear on the top toolbar. There, I can easily decide whether to replace an existing selection area with a new one, add a selection to an existing one, subtract a selection area from an existing one, etc. Unfortunately, as the name implies, Paint.NET is reliant upon Microsoft's .NET architecture, which makes it more difficult than usual to port the program to Linux.
As of now, all I know is that some have worked on a version of Paint.NET for Linux that uses Mono in place of .NET, with decent, but not ideal, results. As such, I'm stuck with GIMP (in Linux, anyway), and will be for the foreseeable future :( I really hope that GIMP's development team can fix GIMP's interface enough that it's less reliant on menus and generally more like Paint.NET's, for the end-user's sake, you know? So Wine doesn't work with every Windows program out there, but with time and/or luck, your favorite Windows program may work well with it. Wine's website maintains a database of programs and their status (i.e., whether they work or not), so you can use that resource to find out whether others have been successful in using a certain program with Wine. Of course, the database may contain outdated information, and it wouldn't be uncommon for your specific program to not be listed, so you can always try the program yourself. Good luck :)