More Than Four

Areas of Interest, as counted by my cat

Page 2 of 13

Learning Git – Appendix D: What’s the DIFF?

Ray is writing the next great American detective novel, and we’ve started by creating a remote BitBucket Git repository to manage his changes.

Ray clones the repository:

$ git clone https://ray-chandler@bitbucket.org/ray-chandler/noir.git

Consider that Ray now has a tracking branch called “master”; a remote reference called “origin”; and a remote-tracking branch “origin/master”.

He has many possible versions of his novel:

  • The working tree contains Ray’s manuscript with any current un-staged, un-committed changes;
  • The staging area may contain a copy that was saved for inclusion in the next commit (via add);
  • HEAD is the most recent commit in the current branch (i.e. “master”);
  • The remote-tracking branch “origin/master” may also have some un-merged differences from the last fetch operation;
  • The “master” branch in the remote repository on the server may have some un-fetched changes recently checked in by Ray’s editor.

Wow, that’s five possible versions.

Scene: Ray sips at a glass of bourbon, and types furiously, making changes to Chapter 1. He saves his work periodically.

While Ray is accessing his muse, let’s learn about Git’s diff tool:

Using DIFF

The Git diff command is a tool we can use to compare different versions of source files:

$ git diff <target> <source> # 

This produces a report of changes needed to make “target” look like “source”. There are some more common, special cases:

To make <target> look like the working tree:

$ git diff          # target = staging area
$ git diff HEAD     # target = HEAD     
$ git diff <target> 

To make <target> look like the staging area:

$ git diff --staged          # target = HEAD
$ git diff --staged HEAD     # target = HEAD     
$ git diff --staged <target>

What would git commit -a do? Find out with:

$ git diff --staged HEAD

What would git commit do? Find out with:

$ git diff --staged

Examples

Consider the following sequence:

echo "This line is pushed to remote/master." > source.txt
git commit -a -m "Step1"
git push origin
git switch -c secundo
echo "This line is committed to secundo." > source.txt
git commit -a -m "Step2"
git switch master
echo "This line is committed to master." > source.txt
git commit -a -m "Step3"
echo "This line is in the staging area." > source.txt
git add source.txt
echo "This line is in the working tree." > source.txt

We’ve now got different versions of source.txt in all possible locations. Let’s find out the differences:

$ git diff
diff --git a/source.txt b/source.txt
index 16f0b21..c260d9d 100644
--- a/source.txt
+++ b/source.txt
@@ -1 +1 @@
-This line is in the staging area.
+This line is edited in the working tree.

Yup, that describes how to update the staging area to match the version in the working tree. (From here on out, I’ll omit the first few lines of the diff output).

$ git diff --staged origin/master
:
-This line is pushed to remote/master.
+This line is in the staging area.
$ git diff secundo master
:
-This line is committed to secundo.
+This line is committed to master.

Exercise for the student: Try out the other variations.

Back to that detective story

Cut scene: Editors office. Penny White is sitting at her computer.

Penny cloned the repository yesterday, but she’s pretty sure they’ll be some updates from Ray on the server:

$ git fetch origin
:
Unpacking objects: 100% (6/6), 738 bytes | 22.00 KiB/s, done.
From https://bitbucket.org/ray-chandler/noir
   14c6528..713d32a  master -> origin/master
$ git status
On branch master
Your branch is behind 'origin/master' by 2 commits, and can be fast-forwarded.
 (use "git pull" to update your local branch)
nothing to commit, working tree clean

Penny could merge at this point, but before she does, she’d like to see what the changes are.

From our experiments above, we know that in order to see what will change during the merge, Penny will need to request a diff using “master” as the <target> and “origin/master” as the <source>:

$ git diff master origin/master
diff --git a/detective.txt b/detective.txt
index df79c59..8616f4d 100644
--- a/detective.txt
+++ b/detective.txt
@@ -1,7 +1,9 @@
 Chapter 1

 There was a knock at the door. I quickly hid the comic book under the WIRED magazine
-and took out some official looking papers and scattered them about the desk.
+on the desk and took out some official looking papers and scattered them about.

 "Jess!" I yelled.

+The door opened. The siloette blocking the light from the door was a dame,
+but it wasn't Jess.

Ray has been busy. The first thing Penny is going to do after merging is correct the spelling of “silhouette”.

A Graphical Diff

Penny is not a fan of the text output from diff, so she has read up on the difftool command. It is essentially identical to diff, except Git will launch your preferred graphical utility to display the diff information.

If you’re interacting with source code from inside an IDE that includes Git integration, this probably won’t be something you’ll need to do. But it helps to understand what is going on behind the scenes.

I recommend meld, as it is reviewed positively and has versions for both Windows and Linux: https://meldmerge.org/

After we’ve installed it, we can enable it in Git by adding the following lines to .gitconfig:

[diff]
    tool = meld
[difftool "meld"]
    path = meld    ; Windows: c:\\Program Files (x86)\\Meld\Meld.exe
[difftool]
    prompt=false

[merge]
    tool = meld
[mergetool "meld"]
    path = meld   ; Windows: c:\\Program Files (x86)\\Meld\Meld.exe
[mergetool]
    keepBackup = false

Penny views the changes using Meld:

$ git difftool master origin/master
Using Meld to view the diff

Meld presents each changed file in a separate window, opening them in sequence as the previous one is closed. For many files, this can get clumsy.

An alternative would be to generate a list of files with differences, then launch the difftool for each file on your own schedule:

$ git diff --compact-summary master origin/master
 detective.txt   | 7 ++-----
 references.txt  | 1 -
 2 files changed, 2 insertions(+), 6 deletions(-)
$ git difftool master:detective.txt origin/master:detective.txt

That last command demonstrates how Penny would specify a single file, using the “branch_path:file_spec” notation.

Further reading:

That’s all for this Appendix. Go back to the article here.

DVD ripping with Handbrake on Linux

We’re enjoying our new Synology NAS, and leveraging DLNA to access music and video from any client on our network, including iPad, Roku, and desktop devices. Audio CD ripping on Linux Mint isn’t a problem, but DVDs are more complicated.

I’m using Handbrake, because it is open source, free, fast, and very configurable. Also, it is available for both Windows and Linux. On the downside, it is not newbie-friendly; not a point-and-click app. “With great flexibility comes a learning curve”. Hence, my decision to write up my process so far.

Installing Handbrake

In Linux Mint, Handbrake is available in the Software Manager. At time of writing, there are three options available. I chose the GUI version that is not flatpak.

Installation was uneventful. The UI is complicated, at first blush, and I’m not qualified to explain it. I will document my steps to rip a DVD.

About commercial DVDs

A DVD is essentially a file system with directories (such as VIDEO_TS) containing files, including menus, videos, FBI warnings, trailers, images, etc. Depending on the DVD, there may be multiple video files for different aspect ratios. Languages, commentaries, and captions seem to be encapsulated in the video files themselves, although there may be multiple video files for different aspect ratios or edits such as Director’s Cut, alternate endings, etc.

Choosing a format (I)

It is possible to create a single .ISO file containing a disc image of the DVD, and copy it to the NAS. There are clients (e.g. VLC) that can open the .ISO and present the disc menu interactively, just like a DVD player. That’s one option. On the downside, .ISO images take up storage space, and you are limited in how to view them over the network. Given our DLNA ecosystem, I wanted to explore extracting the video files into a format that was more compact and streaming-friendly.

There are more formats than you can shake an empty DVD case at, each with advantages and disadvantages. More on this later.

Loading a DVD into Handbrake

Insert the DVD into your drive, and close any application that tries to play it automatically. I chose a commercial release of the film “Galaxy Quest” because they’ll be some extra bonus features and soundtracks to review.

Run Handbrake (the icon is under the Sound & Video menu) and scan the disc for source files. I found the quickest way to scan the disc in Handbrake is to use the File menu, there’s an entry for the optical disc drive:

Handbrake scans the disc and identifies a list of 25 possible source files.

By default, the first file is selected.

Exploring the DVD files

Note that at first glance there is no information about what these source files are, except for their sequence number and their duration. There’s no way to view any DVD “menu” within Handbrake, although Handbrake will generate a set of preview images from each source file to help us identify.

Here’s what I determined from reviewing the preview images for each title:

 1  1:42:00   Feature Presentation
 2    10:07   On Location In Space - Making Of
 3     2:15   unknown sequence 
 4     1:47   unknown sequence 
 5      :40   unknown sequence
 6      :52   unknown sequence
 7     2:02   FX test sequence?
 8     1:30   unknown sequence 
 9      :47   unknown sequence 
10     9:47   unknown sequence
11     1:54   unknown sequence
12     1:12   Trailer #1
13     2:20   Trailer #2
14     1:50   Trailer #3
15     2:09   Tim Allen talking
16      :42   Sigourney Weaver talking
17      :46   Alan Rickman talking
18      :21   Tony Shaloub talking
19      :34   Sam and Daryl talking
20      :21   Sam and Daryl talking
21     1:21   talking head
22      :25   talking head
23      :35   talking head
24      :13   talking head
25      :24   Dreamworks Logo

Handbrake can extract any one of these 25 source files and convert to a stand-alone video file. Galaxy Quest is an “interesting” example, in that the extras aren’t obvious. The actual DVD menu may help decipher this list. Handbrake can’t play the DVD menu, but VLC can. (Or I could put the disc in our DVD Player.)

Looking at the DVD menu in VLC allowed me to compile:

PLAY MOVIE 
SCENE INDEX
SPECIAL FEATURES
    On Location In Space (10:07)
    From The Cutting Room Floor
        PLAY ALL (9:47)
        1. Tech Talk with Sergeant Chen (2:15)
        2. Alex tours his "personalized" quarters (1:47)
        3. A running spat between old flames (0:40)
        4. Guy gets attacked (0:52)
        5. Alex's motivational speech (2:02)
        6. Gwen saves the day (1:30)
        7. The Crew vs. Sarris (0:47)
    Theatrical Trailer (1:54)
    Sneak Preview Trailers
        Chicken Run
        El Dorado
        Road Trip
    Cast and Crew
    Production Notes
SUBTITLES
    English Only Captions ON/OFF
AUDIO
    5.1 ENGLISH DOLBY DIGITAL
    2.0 ENGLISH DOLBY SURROUND
    2.0 THERMIAN DOLBY SURROUND
OMEGA 13

We can now cross-reference this and update that source file list from Handbrake:

 1  1:42:00   Galaxy Quest
 2    10:07   On Location In Space - Making Of
 3     2:15   Deleted Scene - Tech Talk
 4     1:47   Deleted Scene - Alex's quarters
 5      :40   Deleted Scene - Old flames
 6      :52   Deleted Scene - Guy gets attacked
 7     2:02   Deleted Scene - Alex's speech
 8     1:30   Deleted Scene - Gwen saves the day
 9      :47   Deleted Scene - Crew vs Sarris
10     9:47   All Deleted Scenes (alternative)
11     1:54   Theatrical Trailer
12     1:12   Trailer #1 - Chicken Run
13     2:20   Trailer #2 - El Dorado
14     1:50   Trailer #3 - Road Trip
15     2:09   Tim Allen talking
16      :42   Sigourney Weaver talking
17      :46   Alan Rickman talking
18      :21   Tony Shaloub talking
19      :34   Sam and Daryl talking
20      :21   Sam and Daryl talking
21     1:21   Unknown talking head
22      :25   Unknown talking head
23      :35   Unknown talking head
24      :13   Unknown talking head
25      :24   Dreamworks Logo

My guess is that the 15 – 24 “talking heads” clips are somehow part of the “Omega 13” bonus feature, but I couldn’t figure out how to play them from the DVD menu.

Now we have enough information to start the extraction process.

Extracting/Ripping a source file

Let’s start with the Feature Presentation.

Select Title 1 in the Title: dropdown if it isn’t already selected. Note that there are 21 chapters in this source file.

Choosing a format (I)

We need something compatible with common DLNA clients that can retain the multiple audio tracks and subtitle captions; and keep the original DVD image resolution (or close to it). We don’t need HD video, 48kHz sound or 5.1 surround sound. MP4 or MKV seem like a good option.

From the Preset: dropdown, choose Official > Matroska > H.265 MKV 720p30

There are a series of “tab buttons” that select different pages in the UI. We can skip the next two and go directly to Video.

Selecting Video conversion options

The output format preset I’ve recommended here is definitely not the fastest to convert, at least as configured by default. Alternatively I’ve had good results using a different preset, Official > General > Fast 720p30 which is .M4V, and given the debate here, I don’t want to recommend that unilaterally. But it seems faster to process; and the output looks fine; and it is definitely not limited to just Apple clients.

I compared processing time and output file size, limiting the run to just chapters 1-3 of the Galaxy Quest feature presentation:

Fast 720p30 
Render Time:  4:48
Size: 123.5 MB

Matroska H.265 MKV 720p30  
Render Time: 22:04
Size: 108.3 MB

Playing back each of the converted files, I honestly couldn’t tell a difference. They both looked like “perfect” DVD quality rips, and captions and audio soundtrack selections were available.

Hence my personal choice of “Fast 720p30” over the “Matroska”.

That said, just looking at all the options under the Video tab, clearly there is a lot of scope for tweaking; trading off conversion speed for quality, etc. Ideally we want to retain as much of the original quality of the source as possible. However, I don’t recommend changing anything here until we know what the consequences are. Let’s stick with the preset defaults for now.

Next: Audio.

Selecting Audio Tracks

Handbrake makes a default selection but I recommend clicking on + Add All and reviewing.

Normally these are obvious, such as Alternative language dialog, or a Director’s Commentary, etc. Galaxy Quest is a bit of an outlier in that, according to the DVD Menu options (see previous) there are two English-language soundtracks plus an Alien “Thermian” language soundtrack (ha!). I think it would be nice to have two audio tracks available, English and Thermian, and I’d like to alter the default name so that we can tell which is which.

We can remove the 5.1 track because we’ll take our English soundtrack from the 2.0. (5.1 would be fine if it were the only option – but we’ll be converting to Stereo so 2.0 is fine and probably easier for the conversion.)

If we click on the “edit pencil” icon, we can change the name of subtitle Track 3:

Choosing Captions/Subtitles

Next, we select Subtitles. Again, click on + Add All to review the available options:

Why are there two options here? It’s complicated and I don’t fully understand it. Succinctly put:

the closed captions is a data packet inside the mpeg2 video stream and subtitles is a separate stream inside the vob file

“baldrick”, Videohelp forum

Until I have a better handle on which type is best suited to our needs, I retain both to make them available for clients that support them.

Reading List: Handbrake Subtitles: a Complete Guide for First Time Users

VOBSUB are bitmapped overlays, whereas CC608 are text rendered by the player. Probably CC are better, if available, but so far I’ve retained both as available.

Some clients don’t support optional closed caption display, and if we had no other choice, we could choose one of these two subtitle tracks and “burn” them into the rendered video. We wouldn’t be able to turn them off, but they would be displayed by lesser-capable video players. We’d do this by clicking on the “Edit Pencil” against one of the tracks, and changing the option:

(Note to self: Try [x] Set Default Track to see if captions are displayed by default, even by “lesser capable” video player clients.)

Enter the Metadata

Now we select Tags, and change the metadata to suit. At this point, I also change the Save As: output file name that will be created for this Title:

Batch Processing

At this point, we can start the extraction/conversion process immediately, or we can put the job onto a queue and define the next title. Let’s do that: Click on Add To Queue in the toolbar at the top of the window.

The Queue (1) now has one job

Let’s also add the “Making Of” extra as a second job to the queue. We can do this quickly:

  • Select Title: 2 – 00h10m07s which we identified above as being the documentary featurette “On Location In Space”;
  • Leave the Preset unchanged;
  • Click on Audio; +Add All (observe: only one track; leave it as-is);
  • Click on Subtitles; +Add All (observe: no subtitles available (typical for a bonus feature);
  • Click on Tags; enter metadata and set output file name;
  • Click Add To Queue

Review The Queue

We can view and edit the queue by clicking on the Queue (2) button in the toolbar:

Now we can process the entire queue by pressing Start.

Help! My Computer Is Melting!

You’ll notice that Handbrake is making good use of all cores in your CPU and your cooling fan has cranked up to max.

Resource Monitor shows all CPU cores are maxed? YOU SHOULD BE FINE. It’s what they’re for.

However, if it absolutely positively DOESN’T have to be completed in the shortest possible time, you can tell Handbrake to cool its heels and relax its grip somewhat.

Install cpulimit

Linux Mint has the command-line tool cpulimit in the Software Manager:

After you’ve installed it, I recommend editing the command-line of the Handbrake launch menu item, from:

ghb %f

To:

cpulimit -l 400 ghb %f

The -l option specifies a percentage of CPU resources to the application. The trick here is that 100 = 100% of one CPU core. My system has effectively 8 cores, so “400” means that Handbrake (ghb) will only use 50% of the total CPU core resources. You’ll need to tune that number to suit your system.

Here is my system ripping a DVD:

Ahh… still working quite hard but not red-lining

And now, we wait. <skeleton.png>

Reviewing The Results

The Audio Tracks are available, with our naming:

The subtitle tracks are available:

Now you can see why specifying our own label against the audio and subtitle tracks can be beneficial. Track 1 is actually “English (Wide Screen) [VOBSUB]”; and Track 2 is “English, Closed Caption [CC608]”. They contain the same captions but they are rendered differently (how they are rendered could be player-dependent).

Track 3 adds Chapter identification (e.g. “Chapter 2”) at the bottom of the window. I need to investigate what option controls this.

KdenLive notes

KDenLive has been my application of choice for video editing for the last 12 months or so. I use the “standalone” Windows installation. Each time a new version is released (20.04, 20.08) I’ve had this error come up when I try to render a video:

this application failed to start: No QT platform plugin could be initialized

MELT

I’ve looked it up twice now, and for future reference, here is what I needed to do both times: Fix the paths in environment settings, because they retain the location of the previous installation:

These paths have been adjusted to match the new install for 20.08

WordPress: Unable to Select and Crop on IIS?

I was setting up a local dev instance of WordPress for one of my blogs, using IIS and PHP 7. For some reason, the Theme “Customize” feature was giving me an odd error during the image selection process. I’d select an image from the Media Library (or uploaded) and click on “Select and Crop” and then I’d get this:

imageThat says “image crop area preview. Requires mouse interaction”.

It’s the alt text for the image tag. The src=”{{data.url}}” can’t be resolved. (That’s taken out of the view page source). For some reason, that placeholder didn’t get replaced.

I don’t know why it doesn’t work, but I can tell you that after some googling, I found this discussion in which a URL rewrite rule is identified as a potential root cause.

You can find the rule under Computer Management > Services and Applications > Internet Information Services and remember to expand the site directory tree until you locate the WordPress home folder:

image

That might appear to work but, for me, had the side-effect of preventing the Categories and Archive date links from working correctly. I think the rule is a necessary component. I don’t recommend removing it.

Fixing another problem resolves the issue

I no longer experience the behavior described above, and I believe it is due to fixing a separate but related problem:

In my “clean room” test site, http://localhost/wordpress/lab1, I have been attempting to get easy paste image from screen snip working smoothly. It’s been messy, partly due to the problem described above. Eventually I got to a point where I’d save my post and view it, and see this:

The pasted image would not be displayed. The underlying HTML of the page looks correct:

<img src="http://localhost/wordpress/lab1/wp-content/uploads/2020/08/image.png" alt="Thing">

…which is correct. The image file is genuinely in that location. Okay, so, let’s try that URL explicitly in the browser’s address bar:

Alternatively, we can look in the log file at C:\inetpub\logs\LogFiles\W3SCV1\* and see:

2020-08-27 00:02:02 ::1 GET /wordpress/lab1/wp-content/uploads/2020/08/image.png - 80 - ::1 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/84.0.4147.135+Safari/537.36 http://localhost/wordpress/lab1/wp-admin/post.php?post=28&action=edit 500 50 5 0

Googling “WordPress IIS 500 URL rewrite module error” led me to this page:

I will paraphrase here because it is interesting and it could vanish from the web in the future:

IIS WordPress images 500 error

You are probably having the dreaded “broken images” problem while running WordPress under IIS. You see 500 errors logged when it is supposed to be pushing out images.

This is all about fixing permissions so that media files uploaded via WordPress will properly render. The 500 error is thrown when your server has the wrong file permissions. Also called HTTP Error 500.50 – URL Rewrite Module Error when you have Detailed errors on.

The problem is caused because PHP first uploads the document to a temporary directory (by default it is C:\Windows\Temp”) and then moves it from that directory to the actual destination /blog/wp-content/uploads subdirectory.

Because IIS does not have any permissions to C:\Windows\Temp, when the file is uploaded and moved, the file inherits NO permissions. So when IIS tries to server that file from the uploads folder, it throws error 500 which is actually a permissions error.

SOLUTION: Grant “modify” permissions to IUSR and <server>\IIS_USRS on C:\Windows\Temp

Once that is done, files uploaded and moved will have the correct permissions, and will be served without error.

If you have already uploaded files, and are getting the dreaded broken image issue, then go to /blog/wp-content/uploads and replace/update the permissions to add access for the two user accounts noted above.

Alternatively, move the PHP temporary folder elsewhere.

From the PHP.ini file:

[WebPIChanges]
error_log=C:\WINDOWS\temp\PHP72x64_errors.log
upload_tmp_dir=C:\WINDOWS\temp
session.save_path=C:\WINDOWS\temp

I made the permissions change shown above, adjusted the PHP.ini so that upload_tmp_dir was C:\Temp and applied the permission change to that directory. After a quick web site Stop/Restart (just to be sure), everything worked as expected on my local test instance.

Migrating to WordPress 5.4 from BlogEngine.NET

Lisa and I are probably the last people on the planet to still be using BlogEngine.NET for our blogs. Hey, if it works… don’t upgrade it.

Many other respected folks out there have described their process of moving their blog to WordPress, and I’m indebted to them. In this post, I’ll add my notes on the whole experience. Maybe someone out there will find it interesting and helpful.

Overview

Here’s the basic process:

  • Export your posts from BlogEngine.NET admin as BlogML
  • Save your image and attachment files
  • Create a new (empty) WordPress site
  • upload the image and attachment files to the /wp-content/upload folder
  • Obtain and install the BlogML Importer plugin for WordPress
  • Perform some simple text find-and-replace on the BlogML file
  • Tools->Import the BlogML file

In Practice

I hit some potholes… The image link replacements were all straight-forward but required some planning on how the sites were going to be switched. I definitely wanted to keep the same name of the blog site directory (I think it is no secret that it was and is “morethanfour”.)

So the change is minimal – replacing the BlogEngine.NET-specific “/image.axd?picture=” hrefs with “/wp-content/uploads/” path.

It was more complicated than that due to my first importing into a local web root and reviewing the format of the imported posts, and fixing a few strange glitches (due to non-breaking spaces).

Once I was comfortable with the replacements and the final appearance of the imported posts, it was time to deploy to production. For me, this involved the following steps:

  • remove the ASP.NET application on the existing BlogEngine.NET site
  • rename the blog root from “morethanfour” to “old-blog-backup”
  • re-create the “morethanfour” directory
  • upload the WordPress 5.4 install files
  • create the associated blog-specific MySQL database and user
  • ensure that the server was running PHP 5.6 (or later, was not the default!)
  • run the WordPress install process
  • copy the image files (previously saved) into /wp-content/upload
  • copy the blogml-importer plugin into /wp-content/plugins (see below)
  • install/activate the plugin inside the WordPress Administration
  • Go to Tools > Import and run the BlogML Import (which is now available)
  • Follow the prompts

And that is basically it. The final results look great.

Concerning that blogml-importer plugin

The original site with the BlogML Import plugin is no longer available, although other locations have hosted it, including the one I eventually downloaded it from: http://pluginsroom.com/plugins/blogml-importer

The zip archive contains two files, blogml-importer.php and Xpath.class.php, and the second one is not compatible with PHP 5.6 and later. I hacked the Xpath.class.php until I could install the plugin into WordPress without errors.

I have attached my 5.6/7.2 compatible version, zipped, here:

Most of the other potholes aren’t really worth discussing because they were peculiar to differences in PHP version between my hosting provider and my own local (laptop) development site.

This post is not authenticated

Recently I was considering the problem of deep-faked videos and photo-shopped images. The use case I was thinking of was this: Consider a news organization plays an edited clip of a video that appears to show someone saying or doing something that they didn’t do or say.

How to counter this? Perhaps the party in question could release the original, full video and use some mechanism that verifies that it is “official” as opposed to the edited version that would lack such a mechanism.

I believe that the standard technique would be “RSA digital signing” using asymmetric public key cryptography.

User Gilles ‘SO- stop being evil’ on Stack Overflow describes it well:

To use RSA for signing, Alice takes a hash of the message, encrypts the hash using her own private key, and appends the result (this is the signature) to the message. [..] Bob can decrypt the signature using Alice’s public key and see if [his hash of the message] matches. If it does, it must have been encrypted using Alice’s private key, which only she has, so it must have come from Alice.

Giles ‘so stop being evil’ on stack overflow

I also found this article:
A Method for verifying integrity & authenticating digital media

Ted Roche suggests taking a look at Web of Trust, which is interesting.

But perhaps this is all a waste of time. With respect to the news media, we live in a world now where declarations can be made without verification. “Proof” is no longer important. Someone can edit a video or document (whether derived from authentic sources or not) and state whatever the fuck they want to about it, and it will have an impact, potentially damaging, and it won’t matter if there’s an RSA signature associated with it or not.

We need to work on making Proof a thing.

Detecting overlapping date ranges

Consider an Event table, with columns A and B defining the start and end of the period of time over which the event occurred.
Consider also a random period of time defined from S to E.
Now imagine that you want to select all event rows that overlap that specific period of time.

By overlap, we mean that case 1,2,4, and 6 should be included, but 3 and 5 should not:

image

The first solution that might occur to us is this one:

select * from Event
where B between S and E or A between S and E

At first glance it looks clear and elegant, but unfortunately it doesn’t handle case #6.
Somewhat less intuitive but still elegant and also complete is:

select * from Event
where B >= S and A <= E

AES Encryption between Informatica and PostgreSQL

We encountered a situation where encrypted data exported from Informatica could not be decrypted by PostgreSQL, once we’d loaded it into a table.

If you found this page looking for a solution, save your time. I don’t have a solution, I just wanted to make some notes.

How does Informatica encrypt data?

According to this knowledgebase article:

AES_DECRYPT needs binary data as input. Hence when we first encrypt the data using the AES_ENCRYPT function, the data transformed by this function has to be encoded in base64 format. This transformed data can then be used as an input to the AES_DECRYPT function.

In other words:

CIPHERTEXT = Base64_Encode( AES_Encrypt( PLAINTTEXT ))

and

PLAINTEXT = AES_Decrypt( Base64_Decode( CIPHERTEXT ))

According to this thread:

[informatica’s] AES_ENCRYPT function uses the combination as follows which is expressed in terms of functions in DBMS_CRYPTO package of Oracle: DBMS_CRYPTO.ENCRYPT_AES128 + DBMS_CRYPTO.CHAIN_CBC + DBMS_CRYPTO.PAD_ZERO

From Oracle’s documentation, we get:

DBMS_CRYPTO.ENCRYPT_AES128 Advanced Encryption Standard. Block cipher. Uses 128-bit key size.
DBMS_CRYPTO.CHAIN_CBC Cipher Block Chaining. Plaintext is XORed with the previous ciphertext block before it is encrypted.
DBMS_CRYPTO.PAD_ZERO Provides padding consisting of zeroes

So far, so good.

Ensure the pgcrypto extension is installed in PostgreSQL:

In psql:

=> \dx
hstore | 1.4 | public | data type for storing sets of (key, value) pairs
plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language

=> create extension if not exists pgcrypto;
CREATE EXTENSION

=> \dx
hstore | 1.4 | public | data type for storing sets of (key, value) pairs
pgcrypto | 1.3 | public | cryptographic functions
plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language

Looking at the docs for the raw base functions in pgcrypto, it suggests that the default settings for encrypt/descrypt will use MODE=CBC and PADDING=PKCS (any length) which suggests zero padding but that’s a guess. So I think it should match Informatica, at least according to the docs.
See https://www.postgresql.org/docs/9.5/pgcrypto.html

And yet, using the following test code yields an error:

=> select pgp_sym_decrypt( decode('BI830ZHd2kimJTfMD8E07mSTa8HDUbU/lbKFUDl73FrqI6uwAH7oCwPRGjxUkUZw','base64'),
'Abc123Abc123Abc123',
'cipher-algo=aes128' );

ERROR: Wrong key or corrupt data

In this example, I’ve replaced the key text with a random string value – we didn’t actually use that value.

I am pretty confident that the encryption key is correct; less confident that the cipher text is not corrupt, but in simple tests where I DO have confidence, we still get the same error. Where does this leave us? With data we can’t decrypt, unless we can figure out the difference.

Of course, we can encrypt and decrypt values successfully if we stay in the DB domain and use only the PostgreSQL pgcrypto API to perform the encryption. It’s only data coming from Informatica that is giving us trouble.

Assuming the process that is extracting data from Informatica is not corrupting the data, it has to be some difference between the two encryption algorithms.

I’ll continue to research this and if we find a solution, update this post, of course.

Bi-directional Host-only networking with Virtualbox

I thought I already had a post about this, but I can’t find it, so where we go again.

It has been a while since I set up a guest OS and configured it for networking between Host and Guest. Of course, we also want the Guest to be able to see the Internet…

Environment:

  • Host OS : Windows 10
  • Guest OS: Linux Mint

Important Configuration Steps:

  1. The Host Virtualbox need host-only networking enabled
  2. The Guest OS needs a second network adaptor enabled, for host-only networking
  3. The Guest OS needs MANUAL IP configuration
  4. The Host OS needs a Firewall adjustment

1. Enable Host-Only networking in Virtualbox

Under Global Tools, or File > Host Networking Manager, or Ctrl-W:

image

I use the default base address range of 192.168.56.x and the host IP of .1.

I also set up the DHCP server to dispatch addresses in the 101-254 range.

2. Add a second adaptor to the Guest OS

By default the network adaptor in the Guest is NAT. We want to leave this one unchanged, and click on the Adapter 2 tab:

image

Enable the adapter and select “Host-only adapter”.

3. Manually configure the Guest OS IP settings on the new adaptor

At this point we should be able to ping the host:

colin@golem:~$ ping 192.168.56.1
PING 192.168.56.1 (192.168.56.1) 56(84) bytes of data.
64 bytes from 192.168.56.1: icmp_seq=1 ttl=128 time=0.254 ms
64 bytes from 192.168.56.1: icmp_seq=2 ttl=128 time=0.762 ms
64 bytes from 192.168.56.1: icmp_seq=3 ttl=128 time=0.655 ms
^C
--- 192.168.56.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2047ms
rtt min/avg/max/mdev = 0.254/0.557/0.762/0.218 ms

And, indeed we can. Even though the network will be available, the adaptor will probably have obtained an IP from the DHCP server, e.g. 192.168.56.101.

Go to the Network Connections dialog and edit the settings on the second network connection. (I usually rename my Ethernet connections so as to clearly identify which is WAN and which is LAN):

image

For the “Host Network” adaptor, under IPV4 Settings, configure a manual static IP address.

I use 192.168.56.56 because it is easy to remember.

image

At this point, you may need to restart the networking subsystem to respect your changes. I usually just log out and re-login to the Guest desktop at this point.

Check the IP settings using ifconfig :

colin@golem:~$ ifconfig
enp0s3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.0.2.15 netmask 255.255.255.0 broadcast 10.0.2.255
...

enp0s8: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.56.56 netmask 255.255.255.0 broadcast 192.168.56.255
...

Good, looks like it is respecting the static IP we want.

At this point, in the Host OS, try pinging 192.168.56.56. It might work fine… in which case you can stop here. But if you encounter problems communicating with the Guest, you may need to relax the Host firewall:

4. Adjust the Host’s firewall

Seeing as this network only exists virtually on my computer, I want to take the simple way out and just tell my host firewall to get out of the way of this specific network.

In Windows 10, bring up the Windows Defender Firewall and select “Advanced Settings” from the left panel.

image

Select “Windows Defender Firewall Properties”

Select Private Profile and Public Profile tabs, and select “Customize”

Uncheck “VirtualBox Host-Only Network”

At this point, you should be able to communicate between Host and Guest in either direction, using any protocol you want.

Using recursive Common Table Expressions to generate lists

Say we want create a generated list of values – sequential dates, for example – as a set of rows. We can do this using recursive common table expressions.

Here’s an example of a common table expression to generate a set of dates:

with dates as
(
select date '2012-01-01' as DATE_VALUE from dual
union all
select date '2012-01-02' as DATE_VALUE from dual
union all
select date '2012-01-03' as DATE_VALUE from dual
)
select * from dates;

Using this technique, we could create a set of rows with sequential date values… but it is not very flexible. Ideally, we want a technique that expresses the intent algorithmically.

In 11g R2 and later, we can use a reference to the outer table recursively from inside the definition! This sounds crazy but it works:

with dates as
(
select date '2012-01-01' as DATE_VALUE from dual
union all
select d.DATE_VALUE + interval '1' day as DATE_VALUE
from dates d
where d.DATE_VALUE < date '2013-01-01'
)
select * from dates;

In this example, we are UNIONING the first row with a sub-select against itself, adding a new row that increments the date value by 1.

The first time I tried this technique, I got the following error:

ORA-32039: recursive WITH clause must have column alias list

Some research on the web gave me a page that explained what the problem was. The error message is actually very clear: we need to add a list of columns at the top of the WITH <alias> clause:

with dates ( DATE_VALUE ) as
(
select date '2012-01-01' as DATE_VALUE from dual
union all
select d.DATE_VALUE + interval '1' day as DATE_VALUE
from dates d
where d.DATE_VALUE < date '2013-01-01'
)
select * from dates;

And now, it just works:

SQL>
DATE_VALUE
-----------
2012-01-01
2012-01-02
2012-01-03
2012-01-04
2012-01-05
2012-01-06
2012-01-07
2012-01-08
2012-01-09
2012-01-10
2012-01-11
:

Aside: Here’s how to do the same query in PostgreSQL:

with recursive dates ( DATE_VALUE ) as
(
select date '2012-01-01' as DATE_VALUE
union all
select cast(d.DATE_VALUE + interval '1' day as date) as DATE_VALUE
from dates d
where d.DATE_VALUE < date '2013-01-01'
)
select * from dates;

The nice thing about this technique is that we can control the end date, and the increment size, by variables:

define _increment = '5';
define _start = date '2012-01-01';
define _end = date '2012-02-01';

with dates ( DATE_VALUE ) as
(
select &_start as DATE_VALUE from dual
union all
select d.DATE_VALUE + interval '&_increment' day as DATE_VALUE
from dates d
where d.DATE_VALUE < &_end
)
select * from dates;

This gives us a list of dates 5 days apart.

Using this technique in a PL/SQL block

If you’re in a PL/SQL block, trying to generate a list dynamically based on parameter values, the code will need to look slightly different:

declare
p_start_time date;
p_interval_days integer;
p_end_date date;
--
l_interval interval day to second;
begin
l_interval := numtodsinterval( p_interval_days, 'day' );
insert into My_List_Of_Dates
with
dates ( DATE_VALUE ) as
(
select
p_start_time as DATE_VALUE
from dual
union all
select
DATE_VALUE + l_interval as DATE_VALUE
from dates d
where d.END_TIME < p_end_date
)
select DATE_VALUE from dates;
end;

« Older posts Newer posts »

© 2024 More Than Four

Theme by Anders NorenUp ↑