Porting it to Python turned out to be too large a project, not helped by the many loops and multidimensional arrays. So I went looking for a BASIC interpreter for the Mac, and found PC-BASIC, a cross-platform GW-BASIC emulator.
As luck would have it, ZEESLAG.BAS contains the BASICODE subroutines for GW-BASIC, so the program runs as-is. The only issue is that the program runs way too fast. You'll want to add a couple of zeros to the delay loops on lines 20160 and 20560. (Delay loops? Yes...)
The next step was to see if I could run the whole thing from the command line on the terminal. PC-BASIC does let you run it from the terminal:
But... only if your terminal settings are recognized. A quick export LC_ALL=C does wonders here. However, the program didn't recognize the terminal settings for the VT420, and I couldn't find any setting that worked. Until I realized I could run the program in screen, and that worked:
export LC_ALL=C
export TERM=vt420
screen
/Applications/PC-BASIC.app/Contents/MacOS/pcbasic -t
load "zeeslag"
list 20160
(use the cursor keys to go to the 400 value, change it, press enter)
list 20560
(use the cursor keys to go to the 1000 value, change it, press enter)
run
So it can be done. That should be enough nostalgia for a while.
In 2009, I started an effort to digitize all my cassette tapes. As my last computer that still has a line-in port is facing retirement, I decided to finally finish that project. Perhaps more about this later. Turns out some of these old cassettes have weird things on them, including radio broadcasts that contain computer programs.
Back in the 1980s, home computers didn’t come with any storage. But a Commodore 64 floppy drive cost the same as the computer itself here in Europe. So it was common to use a cheap cassette drive to store programs and data. You could of course buy commercial software and/or exchange copies with friends. But without a modem, which didn’t appear until around 1990, there was no good way to exchange data with larger like-minded groups. It also didn’t help that there were many different home computers, and they were all incompatible with each other.
Both these problems were addressed with BASICODE. This was a lowest common denominator subset of the BASIC programming language that all home computers came with. For the essential functions missing from the common BASIC subset, BASICODE provided a set of standardized subroutines. So if you wanted to run BASICODE programs, all you had to do was write the subroutines for clearing the screen, setting the cursor position, et cetera for your specific computer, and then you could run all BASICODE programs.
But you still had to get these programs. Solution: a standardized cassette data format. BASICODE for a certain computer model would typically be able to read and write BASICODE from/to cassette. There were also little adapters that plugged into the serial port. These days, we can use the program minimodem to decode BASICODE from WAV files. I pulled pvtmert/minimodem from Docker and used the following to decode my recordings:
(The tr command strips the high bit that is set to 1 in the BASICODE protocol.)
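Something along these lines should do it (a sketch only: recording.wav and program.bas are placeholders, 1200 baud with 2400/1200 Hz tones matches the BASICODE standard, and I'm assuming the pvtmert/minimodem image runs minimodem as its entrypoint):
docker run --rm -v "$PWD":/data pvtmert/minimodem --rx --file /data/recording.wav --mark 2400 --space 1200 1200 | tr '\200-\377' '\000-\177' > program.bas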
In addition to broadcasting a selection of various user-submitted BASICODE programs, the radio program Hobbyscoop also broadcast a weekly newsletter in the form of a BASICODE program. Turns out I have a recording of the 250th one, from 1989. This is how the program came off of the tape.
After cleaning it up, I wanted to run the program on my C64 Mini, but for the life of me I couldn’t find the C64 BASICODE subroutines. So I made my own. Get the .prg file here if you want to try for yourself. Or try it here with a Javascript BASICODE interpreter.
Then I decided to see how hard it would be to make BASICODE run in Python. Have a look at the resulting code here. Turns out that with some small syntax differences a lot of BASIC statements are basically the same in Python, and it’s easy enough to implement most missing ones with a few lines of code. The big difference is that in BASIC, you mostly need to structure programs yourself using GOTO statements, while modern languages like Python are much more structured and don’t have GOTO. Also, in BASIC all variables are global. So the porting is easy, but not entirely trivial.
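For example, a loop that a BASIC program builds out of GOTO statements (a made-up fragment, not from the newsletter program) turns into a structured while loop in Python:
# 1010 A = 1
# 1020 PRINT A
# 1030 A = A + 1
# 1040 IF A <= 10 THEN GOTO 1020
a = 1
while a <= 10:
    print(a)
    a = a + 1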
The hardest part was getting the reading of the cursor position to work properly. In xterm you do this by sending an ANSI escape sequence to the terminal, which then sends one back that you read from standard input. Strangely, this was also the hardest part on the Commodore 64, where I eventually had to call a KERNAL (system) routine to do this.
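The xterm side of that looks roughly like this in Python (a sketch of the technique, not the exact code from my port): put the terminal in raw mode, send the "report cursor position" escape sequence, and read back the ESC[row;colR reply from standard input.
import re
import sys
import termios
import tty

def cursor_position():
    fd = sys.stdin.fileno()
    old = termios.tcgetattr(fd)
    try:
        tty.setraw(fd)                  # raw mode: no echo, no line buffering
        sys.stdout.write("\x1b[6n")     # DSR: ask the terminal for the cursor position
        sys.stdout.flush()
        reply = ""
        while not reply.endswith("R"):  # the reply arrives as ESC [ row ; col R
            reply += sys.stdin.read(1)
    finally:
        termios.tcsetattr(fd, termios.TCSADRAIN, old)
    row, col = (int(x) for x in re.findall(r"\d+", reply))
    return row, col

print(cursor_position())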
Rumors that Apple will transition the Mac from Intel CPUs to ARM CPUs designed by Apple itself have been around for some years, and now they've come to a head: apparently, Apple will announce the transition at their WWDC conference a week and a half from now.
CPU transitions are nothing new for Apple. Back in 1984, the original Mac had a Motorola 68000 CPU. This was an advanced CPU at the time, running 32-bit software when cheap computers still used 8-bit 6502 or Z80 CPUs and PCs used 16-bit 8086 CPUs. Still, ten years later the 68000 family had run out of steam and Apple transitioned the Mac to PowerPC CPUs made by Motorola and later IBM.
However, PowerPC CPUs use a completely different instruction set than the one used by the 68K series. Apple largely solved that issue by including an emulator in the system that lets the PowerPC CPU execute 68000 code. Of course new software would be written for the PowerPC CPU. But apparently a good amount of 68000 code persisted for years in classic MacOS.
A decade later, it became apparent that the PowerPC was no longer competitive with the Intel and AMD CPUs used in PCs. So in 2005, Apple announced that it would transition to Intel x86 CPUs. (Descendants of the 8086.) Again, Apple provided a way to keep old code running, this time using the Rosetta "dynamic binary translator". However, Rosetta was only used for PowerPC applications, not for parts of the system. These were all 100% x86.
Applications built using Apple's (still new) Xcode development environment would ship as "fat binaries" that contain both PowerPC and x86 code, so the app would run at maximum speed on both types of Macs. In fact, the last PowerPC G5 CPUs and the Core 2 Duo Intel x86 CPUs that Apple started shipping in 2007 were 64-bit CPUs. So a fat binary could even contain 32-bit PowerPC, 64-bit PowerPC, 32-bit x86 and 64-bit x86 code.
The transition from PowerPC to x86 was more complex than the one from 68K to PowerPC because 68K and PowerPC are both "big-endian" while x86 is "little-endian". What that means is that when the CPU stores the 16-bit binary value 1111111100000000 in memory at location 3500, in big-endian, location 3500 holds 11111111 and location 3501 holds 00000000. In little-endian, the lower part of the number comes first, so 00000000 in 3500 and 11111111 in 3501.
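A quick way to see the difference for yourself, sketched in Python:
value = 0b1111111100000000          # the 16-bit value from the example above (0xFF00)
print(value.to_bytes(2, "big"))     # b'\xff\x00': 11111111 comes first in memory
print(value.to_bytes(2, "little"))  # b'\x00\xff': 00000000 comes first in memory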
Shortly after announcing the transition to Intel CPUs, Apple allowed developers to buy/rent a development machine with an Intel CPU to test their applications. After the transition, they had to return that system to Apple.
So now, a decade and a half later, it looks like Apple is doing it again.
There are of course many questions. One is whether Apple will transition the entire Mac line, or maybe use ARM for the laptops and keep using Intel for the high end desktops, most notably the Mac Pro. Apple is leading the industry with its high performance ARM CPUs that it uses in the iPhone and iPad. In benchmarks such as Geekbench they often out-perform Intel CPUs, especially for single core tasks. However, I'm not sure how reliable Geekbench is. Also, running very fast for a few seconds is one thing, but doing the same for many minutes is very different and requires a good cooling system.
Also, does it make sense for Apple to compete with Intel and AMD for the highest performance CPUs? Apple doesn't sell too many of these high end systems, so it would be hard to recoup any investments. Still, the rumors indicate that the whole line will transition. But I still expect the non-pro laptops to adopt ARM first and the pro desktops to be the last.
Another big question is whether there will be some kind of emulation or translation this time, so existing x86 apps can run on the new ARM Macs. On the one hand, Apple is a forward-looking company that doesn't like bending over backwards to maintain the status quo. But they're also pragmatic if they need to be. For instance, they didn't wait for 64-bit Intel CPUs to start the previous transition: they shipped 32-bit Intel CPUs for only about a year, but subsequently had to support those 32-bit systems for a good number of years.
All in all, I think they'll skip emulation and translation. They killed off 32-bit applications last year with MacOS 10.15 Catalina. There are probably still a few applications that are built with tools other than Apple's Xcode, but those must be very rare. Obviously Xcode will make it as easy as possible to create a fat binary with 64-bit x86 and 64-bit ARM code. So any application under active development, which of course includes Apple's as well as the big ones from Microsoft and Adobe that are sold as subscriptions, should be available as fat binaries that run on ARM Macs quickly, if not immediately. And that's probably enough for a good swath of Mac users.
I do wonder how many changes are necessary for existing apps to compile for and run on ARM. As ARM is little-endian like x86, and hand-crafted assembly code that was still common in the 1990s is barely used anymore, I believe "porting" applications to ARM will be very easy.
But what about testing? Do developers need an actual ARM system running MacOS to test their applications, or will the ARM versions behave the same as the x86 version so separate testing on ARM is unnecessary? (Obviously it helps to have an actual ARM system to test ARM-specific features and optimize performance.) Apple could sell/rent development machines like in 2005, but that seems difficult, with so many developers and all the current logistics/economics issues. I'm sure many people within Apple won't like the idea of using an iPad with a keyboard and a mouse or trackpad as an ARM development system. Then again, those iPads are already out there and can go back to being regular iPads after the transition is over, so it's a very pragmatic solution. Or perhaps emulation will work in the opposite direction this time around: a very precise ARM emulator so developers can test their applications without needing ARM hardware.
It's all very interesting, but at the same time I'm in no hurry to upgrade to an ARM Mac. I just hope that this transition will end the relentless push to change all kinds of things in MacOS and allow some time for consolidation and stability. With the Intel transition, I waited until 2007 and was pretty happy with the MacBook Pro I got then. So I'd be surprised if I ended up with an ARM Mac before mid-2022.
Back in 2013, I wrote a blog post about archiving. In it, I compared the costs per terabyte (and the weight per terabyte) of several ways to store data for archival purposes. When I read Beyond Time Machine: 5 Archiving over at The Eclectic Light Company blog, I realized that it’s time to revisit this topic. This is the list of storage options with the price per terabyte in euros, back in 2013 and now in 2020:
                    €/TB 2013   €/TB 2020   Sweet spot
DVD±R                  45          67
BD-R                    -          45
2.5" USB HDD           60          23         4 - 10 TB
3.5" internal HDD      35          32         4 - 8 TB
USB flash             370         195         128 - 256 GB
SD card               520         320         128 GB
Internal SSD            -         125         1 TB
USB SSD                 -         170         1 TB
And just to see how ridiculous things can get, if you buy an Apple computer, SSD upgrades can cost as much as € 1000 per terabyte (€ 250 to go from 256 to 512 GB).
The sweet spot is the size where the cost per terabyte is the lowest. If you go smaller or larger, you pay more for the same amount of storage. I got the 2020 prices by looking at bol.com. Obviously, prices vary, sometimes significantly. I chose the lowest prices, except if only a couple of no-brand options were the cheapest. Note that inflation was around 7% between 2013 and 2020.
What’s interesting here:
DVD±Rs have gone up in price
Blu-ray recordable is now at about the per-TB price DVD±R used to be
3.5" HDDs haven’t really gone down in price per TB
2.5" USB HDDs are now about three times cheaper than they used to be
2.5" USB HDDs are now a good deal cheaper than 3.5" HDDs
SSDs are still much more expensive than HDDs
2.5" USB HDDs are by far the cheapest way to archive your data. They’re also convenient: they’re small in size, and unlike 3.5" USB HDDs (which are extinct now, I think) they are bus powered so no issues with power supplies. Yes, copying terabytes worth of data to a USB HDD takes a lot of time. But you can do that overnight. With BD-Rs you need to put in a new one for every 25 to 100 GB.
I can still read most (but not all!) of the DVDs I burned 15 years ago without trouble, but I wouldn’t bet any money that a 15-year-old USB HDD will still work. You really have to keep your archived data on at least two of those and then replace them every three years or so, copying your data from the old one to a new one. (Or, more likely in my case: from my NAS to a new HDD.) But that’s a lot more doable than duplicating BD-Rs. Also, any computer can read USB HDDs, while for DVDs and blu-rays you need a drive, and those are much less common than they used to be, and that trend is sure to continue.
There's an episode of the TV show Friends where Chrissie Hynde has a guest role. Phoebe feels threatened by her guitar playing, and asks her "how many chords do you know?" "All of them."
Wouldn't it be cool if you could give the same answer when someone asks "how many programming languages do you know?"
But maybe that's a bit ambitious. So if you have to choose, which programming language or languages do you learn? I got started with BASIC, 6502 assembly, Forth and Pascal. Those are now all obsolete and/or too niche. These are other languages that I'm familiar with that are relevant today:
C
Javascript
Python
PHP
I'd say that either C or Javascript is the best choice as a first language to learn. Javascript has the advantage that you only need a web browser and a text editor to get started, and you can start doing fun and useful things immediately. However, Javascript's object orientation and heavy use of events make it hard to fully understand for someone new to programming. So it's probably better to dip a toe in with Javascript and after a while start learning another language to get a better grasp of the more advanced fundamentals.
C is the opposite of Javascript. It's certainly not very beginner friendly, not in the least because it requires a compiler and a fair bit of setup before you can start doing anything. And then you get programs that run from the command line. It's much harder to do something fun or useful in C. However, what's great about C is that it's relatively simple and low level, which means that it's an excellent way to learn more about the way computers and data structures actually work. Because it's a simple language, it's a reasonable goal to learn the entire language. That's especially important when reading other people's code. Also, many other languages such as Java, Javascript and PHP are heavily influenced by C, so knowing C will help you understand other languages better.
If you want to be productive as a programmer and you could only use one language, Python is probably the one. It's used for many different things and has some really nice features to help you get going quickly. But it also has many of its own quirks, and complexity hides just below the surface. So like with Javascript, I would use Python as a "dipping your toe in" language and switch to something else if you want to learn more. A big advantage of Python over C is that you don't need a compiler, but it still (mostly) lives on the command line.
PHP is the language that I've used the most over the last 20+ years. If that hadn't been the case, I'm not sure it would have been on this list. It's not held in very high regard in many circles, so if you want something that looks good on your CV, PHP is not a top choice. Then again, it works very well for web backends, and has an incredible amount of stuff built in, allowing you to be productive quickly. It's also close to C in many ways, so that helps if you already know C. But like Javascript and Python it's a dynamic language, so it takes a lot less work to get things done than in C.
Of course a lot depends on what you want to do. For stuff running in a browser, Javascript is the only choice. For low level stuff, C is the best choice, although Python could work in some cases, too. I think for web backends, PHP is the best fit, but Python can certainly also do that. For developing mobile apps, you need Swift or Objective C. For Android, Java or Kotlin. Mac apps are also generally in Objective C, with Swift (Apple's relatively new language) becoming more common. On Windows, a lot of stuff is written in C#. A lot of lower-level stuff, especially graphics, is done in C++. So these are all very useful languages, but I wouldn't recommend any of them as a first language.
So let's have a look at a simple program in each of those languages, and then see how fast they run that same program. For a given input, the program calculates what numbers that input is divisible by. (It's not optimized in any way and there is no error checking, so it's not an example of good code.)
C:
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    int n, i;
    n = atoi(argv[1]);
    if (n % 2 == 0)
        printf("%d\n", 2);
    for (i = 3; i < n; i += 2)
        if (n % i == 0)
            printf("%d\n", i);
}
PHP:
<?php
$n = $argv[1];
if ($n % 2 == 0)
    printf("%d\n", 2);
for ($i = 3; $i < $n; $i += 2)
    if ($n % $i == 0)
        printf("%d\n", $i);
Javascript:
<script language=javascript>
var ts1, ts2, n, i;
ts1 = new(Date);
n = 444666777;
if (n % 2 == 0)
    document.write(2 + "<br>\n");
for (i = 3; i < n; i += 2)
    if (n % i == 0)
        document.write(i + "<br>\n");
ts2 = new(Date);
document.write("Time: " +
    (ts2.getTime() - ts1.getTime()) / 1000 +
    " seconds<br>\n");
</script>
Python:
import sys
import math

n = int(sys.argv[1])
if (n % 2 == 0):
    print(2)
for i in range(3, n, 2):
    if (n % i == 0):
        print(i)
(For the Javascript version I hardcoded 444666777 as the input, for the others the input is read from the command line.)
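If you want to reproduce the timings for the command line versions, something like this works (the file names are placeholders; the Javascript version times itself in the browser):
gcc -O3 -o divisors divisors.c
time ./divisors 444666777
time php divisors.php 444666777
time python divisors.py 444666777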
Common wisdom is that compiled languages like C are faster than interpreted languages (the others). That turns out to be true, with the C version (compiled with -O3 optimizations) taking 0.7 seconds on my 2013 MacBook Pro with a 2.4 GHz Intel i5 CPU.
But interestingly, the Javascript is barely any slower at just over 1 second. This shows just how much effort the browser makers have poured into making Javascript faster.
The PHP version, on the other hand, takes more than 21 seconds, and the Python version takes 50 seconds. Weirdly, 15 of those seconds were spent running system code. This is because running the Python program uses up 6 GB of memory on my 8 GB system, so the system has to do all kinds of things to make that work.
It turns out that having a for loop with the range function is problematic. It looks like range first creates the requested range of numbers in memory (all 222 million of them!) and then the for loop goes through them. But we can replace the for loop with a while loop:
import sys
import math

n = int(sys.argv[1])
if (n % 2 == 0):
    print(2)
i = 3
while (i < n):
    if (n % i == 0):
        print(i)
    i = i + 2
This does the same thing, but in a way that's more like the for loops in the other languages. This version takes 36 seconds, and, more importantly, there are no issues with memory use.
C can do these calculations really fast because the overhead of pushing many small instructions to the CPU is small. Each instruction has more overhead in the other languages. With more complex operations, such as manipulating text strings, C's advantage is a lot smaller, because each operation in the program leads to a much larger number of instructions for the CPU, so the language overhead is a much smaller part of the running time. I haven't been able to think of a nice simple test program to see how big the difference is, though.