A Terminal Case of Linux


Has this ever happened to you?

You want to look at a JSON file in your terminal, so you pipe it into jq so you can look at it with colors and stuff.

Cool bear's hot tip

That's a useless use of cat.

...oh hey cool bear. No warm-up today huh.

Sure, fine, okay, I'll read the darn man page for jq... okay it takes a "filter" and then some files. And the filter we want is "." which, just like in file paths, means "the current thing":

There! Now you have pretty colors! But say your JSON file is actually quite large, and it doesn't fit in your terminal window, so you want to use a pager, maybe something like less.

Well, showing a less invocation is actually quite annoying, so we'll just use cat instead.

Again?

Yes, bear, again. I know how you feel about cats, in fact you're closer to dogs, but please bea... please let me continue.

Now the pretty colors are all gone!

Of course, we can force jq to output colors anyway, with -C (short for --color-output):

But something is afoot.

And it's not just jq! ls even starts separating items with newlines:

We can "fix" it with --color=always, but yeah. There's definitely some detection going on there too.

Also, if we pipe ls --color=always to less, we see strange markings!

If we want to see color in less, we need to use the -R flag (short for --RAW-CONTROL-CHARS; yes, really).

In fact... let me try something:

AhAH! We can save colors to a file and print them later - and our friend xxd the hex dumper shows us that the colors are really part of the output.

What did we learn?

So far, we know three things. On Linux,

  • Colors are part of the output, they're in-band
  • Some programs stop outputting color when their output is redirected
  • There's usually a way to force them to output color anyway, but it's a per-program setting (there's no standard)
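To make "in-band" concrete, here's a minimal sketch of what a colored name looks like as bytes. The helper names (colorize, hex) are made up for this example, and real ls output carries a few more attributes than a single color code, so treat it as an illustration rather than a copy of what ls emits.

```rust
/// Wrap `text` in an ANSI "select graphic rendition" sequence.
/// `code` is the color number, e.g. 34 for a blue foreground.
fn colorize(text: &str, code: u8) -> String {
    // ESC [ <code> m starts the color, ESC [ 0 m resets it.
    format!("\x1b[{}m{}\x1b[0m", code, text)
}

/// Render bytes the way a hex dumper would, so the escapes become visible.
fn hex(bytes: &[u8]) -> String {
    bytes
        .iter()
        .map(|b| format!("{:02x}", b))
        .collect::<Vec<_>>()
        .join(" ")
}

fn main() {
    let colored = colorize("src", 34);
    // On a terminal this shows "src" in blue...
    println!("{}", colored);
    // ...but the color is just extra bytes around the name.
    println!("{}", hex(colored.as_bytes()));
}
```

The 1b bytes in the second line are the escape characters: the same kind of thing xxd showed in the saved file.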

So far we've been running commands from zsh, because that's the shell I dislike the least right now.

But what if we execute commands from another program? Like, a Rust program?

Let's try doing that:

Shell session
$ cargo new terminus
     Created binary (application) `terminus` package
$ cd terminus/
Rust code
// in `terminus/src/main.rs`
use std::{error::Error, process::Command};
fn main() -> Result<(), Box<dyn Error>> {
    let out = {
        Command::new("/bin/ls")
            .arg("--color=auto")
            .output()
            .map(|s| String::from_utf8(s.stdout))??
    };
    println!("{}", out);
    Ok(())
}

What, no fancy crates today?

No! I would normally pull in color-eyre, tracing and tracing-subscriber, but today we're on a diet.

Now, if we run this, we can see...

No colors.

What? I see green.

That's just cargo's output; we could suppress it with --quiet (or -q for short).

Ahhh, the output of ls doesn't have colors. And it should have like, blue for src and target, given that they're directories.

Exactly!

So! ls knows its output is being redirected somewhere, and it doesn't print colors. Even when we execute it from a Rust program.

But how? We could look at the source code for ls. That would be fun — I've never done that!

Luckily, there's a GitHub mirror for coreutils, so it's not too hard to find.

In ls.c, in the decode_switches function, we can find a switch inside of a while(true) loop, that seems to process command-line arguments.

Here for example, it enables "human-friendly output":

C code
// in `coreutils/src/ls.c`
        case 'h':
          file_human_output_opts = human_output_opts =
            human_autoscale | human_SI | human_base_1024;
          file_output_block_size = output_block_size = 1;
          break;

And.. here's the --color switch:

C code
// in `coreutils/src/ls.c`
       case COLOR_OPTION:
          {
            int i;
            if (optarg)
              i = XARGMATCH ("--color", optarg, when_args, when_types);
            else
              /* Using --color with no argument is equivalent to using
                 --color=always.  */
              i = when_always;
            print_with_color = (i == when_always
                                || (i == when_if_tty && stdout_isatty ()));
            break;
          }

Ah! print_with_color is set to a truthy value if:

  • --color=always is passed (we knew that), or
  • --color=auto is passed and stdout_isatty() returns true
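Here's roughly the same decision written in Rust, as a sketch. The When enum and both function names are invented for this example, only three spellings of the argument are handled, and the standard library's IsTerminal stands in for whatever ls does to detect a terminal.

```rust
use std::io::{self, IsTerminal};

#[derive(Clone, Copy, Debug, PartialEq)]
enum When {
    Never,
    Always,
    IfTty,
}

/// Map the argument of `--color[=WHEN]` to a `When`.
/// `None` models a bare `--color`, which ls treats like `always`.
fn parse_when(arg: Option<&str>) -> Option<When> {
    match arg {
        None | Some("always") => Some(When::Always),
        Some("auto") => Some(When::IfTty),
        Some("never") => Some(When::Never),
        Some(_) => None, // unknown value; this sketch just gives up
    }
}

/// Same shape as the `print_with_color = ...` expression in ls.c.
fn should_color(when: When, stdout_is_tty: bool) -> bool {
    when == When::Always || (when == When::IfTty && stdout_is_tty)
}

fn main() {
    let when = parse_when(Some("auto")).expect("known value");
    let tty = io::stdout().is_terminal();
    println!("print_with_color = {}", should_color(when, tty));
}
```

Passing the "is it a terminal" answer in as a plain bool keeps the decision itself easy to check without a real terminal.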

And here's the code for stdout_isatty():

C code
// in `coreutils/src/ls.c`
/* Return true if standard output is a tty, caching the result.  */
static bool
stdout_isatty (void)
{
  static signed char out_tty = -1;
  if (out_tty < 0)
    out_tty = isatty (STDOUT_FILENO);
  assume (out_tty == 0 || out_tty == 1);
  return out_tty;
}
Cool bear's hot tip

Look at that, it's even doing memoization!

Initially, the value of out_tty is -1, but it persists across function calls (because it's static), so after the first call returns, it'll be 0 or 1, depending what isatty returns.
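The same "ask once, remember forever" trick can be sketched in Rust with a OnceLock instead of a static signed char. This isn't a translation of ls: the PROBES counter exists only to make the caching observable, and is_terminal() from the standard library stands in for isatty.

```rust
use std::io::{self, IsTerminal};
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

// Counts how often we actually ask the OS. Only here to show the cache works.
static PROBES: AtomicUsize = AtomicUsize::new(0);

fn probe_stdout() -> bool {
    PROBES.fetch_add(1, Ordering::SeqCst);
    io::stdout().is_terminal()
}

/// Returns whether stdout is a terminal, asking the OS only on the first call.
fn stdout_isatty() -> bool {
    // Like `static signed char out_tty = -1`, but "not computed yet"
    // is a separate state rather than a magic value.
    static CACHE: OnceLock<bool> = OnceLock::new();
    *CACHE.get_or_init(probe_stdout)
}

fn main() {
    let first = stdout_isatty();
    let second = stdout_isatty();
    println!(
        "tty: {} / {} (asked the OS {} time(s))",
        first,
        second,
        PROBES.load(Ordering::SeqCst)
    );
}
```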

Mhh, isatty, we can probably call that from Rust, right?

Feels like a libc function... ah yep here's a man page:

DESCRIPTION

These functions operate on file descriptors for terminal type devices.

The isatty() function determines if the file descriptor fd refers to a valid terminal type device.

The ttyname() function gets the related device name of a file descriptor for which isatty() is true.

The ttyname() function returns the name stored in a static buffer which will be overwritten on subsequent calls. The ttyname_r() function takes a buffer and length as arguments to avoid this problem.

Okay okay, how do we call libc functions... uhh well we know the Rust standard library relies on libc for a bunch of things, so surely we already link against it...

Shell session
$ ldd ./target/debug/terminus
        linux-vdso.so.1 (0x00007ffd7cf69000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f4e15280000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f4e1525d000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f4e15257000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f4e15065000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f4e15302000)

Yes we do! That's libc.so.6 in the list, which should have isatty...

Shell session
$ nm -D /lib/x86_64-linux-gnu/libc.so.6 | grep -E 'isatty|ttyname'
00000000001324d0 T __ttyname_r_chk
0000000000112cc0 W isatty
0000000000112580 T ttyname
0000000000112900 W ttyname_r

Yes! Although, uh, isatty is a weak symbol.

Anyway. If we define isatty in an extern "C" block, we should be able to call it.

Call it but... with what?

Well it takes an int fd, which is a file descriptor. Typically, we have:

  • 0 for stdin (standard input)
  • 1 for stdout (standard output)
  • 2 for stderr (standard error)

Ah, we can try 1 then!

Rust code
// in `terminus/src/main.rs`
use std::{error::Error, os::raw::c_int};
extern "C" {
    fn isatty(fd: c_int) -> c_int;
}
const STDOUT: c_int = 1;
fn main() -> Result<(), Box<dyn Error>> {
    let stdout_is_tty = unsafe { isatty(STDOUT) } == 1;
    dbg!(stdout_is_tty);
    Ok(())
}
Shell session
$ cargo build --quiet
$ ./target/debug/terminus
[src/main.rs:11] stdout_is_tty = true
$ ./target/debug/terminus | cat
[src/main.rs:11] stdout_is_tty = false

So that's how they do it!
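Nothing is special about file descriptor 1, by the way. Here's a small sketch that asks the same question about all three standard descriptors, plus a file that's definitely not a terminal. The is_tty wrapper is a name made up for this example; the extern declaration is the same one as above.

```rust
use std::fs::File;
use std::os::raw::c_int;
use std::os::unix::io::AsRawFd;

extern "C" {
    fn isatty(fd: c_int) -> c_int;
}

/// Thin wrapper: true if `fd` refers to a terminal.
fn is_tty(fd: c_int) -> bool {
    // isatty only inspects the descriptor; an invalid one just yields 0.
    unsafe { isatty(fd) == 1 }
}

fn main() -> std::io::Result<()> {
    for (name, fd) in [("stdin", 0), ("stdout", 1), ("stderr", 2)] {
        println!("{:>6} (fd {}): tty = {}", name, fd, is_tty(fd));
    }

    // A regular device file gets a descriptor too, but it's no terminal.
    let dev_null = File::open("/dev/null")?;
    let fd = dev_null.as_raw_fd();
    println!("/dev/null (fd {}): tty = {}", fd, is_tty(fd));
    Ok(())
}
```

Try running it with `< /dev/null`, `| cat` or `2> /dev/null` to flip each answer independently.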

Okay so there's two ways we can go from here: we can go lower-level, or we can go higher-level.

Let's first dig down: how does isatty even work. We give it a file descriptor, literally just "the integer value one", and it tells us if it's a terminal or not.

But libc is just, you know, a big flaming ball of C code. There's no reason for it to have any extra powers — there's programs that don't use libc at all, and they're still able to tell whether or not a file descriptor is a terminal.

Which is to say, libc is not in charge of file descriptors. The kernel is.

And we know one way userland applications (like ours) can talk to the Linux kernel is by performing a syscall.

So... is it making a syscall? Let's ask our friend strace the... ostrich? (No, for real, click that link).

We'll change our main function to look like this:

Rust code
// in `terminus/src/main.rs`
fn main() -> Result<(), Box<dyn Error>> {
    println!("calling isatty...");
    let stdout_is_tty = unsafe { isatty(STDOUT) } == 1;
    println!("calling isatty... done!");
    dbg!(stdout_is_tty);
    Ok(())
}

...just so it's easier to pinpoint the moment we call isatty:

Shell session
$ cargo build --quiet
$ strace -o /tmp/strace.log -- ./target/debug/terminus
calling isatty...
calling isatty... done!
[src/main.rs:13] stdout_is_tty = true
$ cat /tmp/strace.log | grep 'calling isatty... done' -B 2
write(1, "calling isatty...\n", 18)     = 18
ioctl(1, TCGETS, {B38400 opost isig icanon echo ...}) = 0
write(1, "calling isatty... done!\n", 24) = 24

That's another useless use of cat!

AhhhHHHHH FINE

Shell session
$ grep 'calling isatty... done' -B 2 /tmp/strace.log
write(1, "calling isatty...\n", 18)     = 18
ioctl(1, TCGETS, {B38400 opost isig icanon echo ...}) = 0
write(1, "calling isatty... done!\n", 24) = 24

The original output is a lot so I used a well-placed grep to whittle it down to something reasonable.

Okay, so it is making a syscall. Our println! and dbg! macros end up being write syscalls, which take three arguments: the file descriptor, the data, and the length.
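Those three arguments map directly onto libc's write wrapper, which we can call ourselves. A sketch, assuming Linux and the usual C signature for write; raw_write is a name invented here, and this is not how println! is implemented internally (that goes through std's own buffering first).

```rust
use std::fs::OpenOptions;
use std::os::raw::{c_int, c_void};
use std::os::unix::io::AsRawFd;

extern "C" {
    // ssize_t write(int fd, const void *buf, size_t count);
    fn write(fd: c_int, buf: *const c_void, count: usize) -> isize;
}

/// Write `data` to `fd`, returning what write(2) returned:
/// the number of bytes written, or -1 on error.
fn raw_write(fd: c_int, data: &[u8]) -> isize {
    // The pointer and length come from a valid slice.
    unsafe { write(fd, data.as_ptr() as *const c_void, data.len()) }
}

fn main() -> std::io::Result<()> {
    // fd 1 is stdout, same as in the strace output.
    let n = raw_write(1, b"calling isatty...\n");
    eprintln!("write returned {}", n);

    // Any writable descriptor works the same way.
    let sink = OpenOptions::new().write(true).open("/dev/null")?;
    let n = raw_write(sink.as_raw_fd(), b"into the void\n");
    eprintln!("write to /dev/null returned {}", n);
    Ok(())
}
```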

Oooh it's writing to file descriptor 1! That's the standard output!

Correct!

And it's doing a syscall to ioctl! Also passing file descriptor 1.

Which returns... 0. That's kernel for great success!

However, if we redirect terminus's output:

Shell session
$ strace -o /tmp/strace.log -- ./target/debug/terminus > /dev/null
[src/main.rs:13] stdout_is_tty = false
$ grep 'calling isatty... done' -B 2 /tmp/strace.log
write(1, "calling isatty...\n", 18)     = 18
ioctl(1, TCGETS, 0x7ffe0d358c30)        = -1 ENOTTY (Inappropriate ioctl for device)
write(1, "calling isatty... done!\n", 24) = 24

Then it returns -1, which strace helpfully translates to ENOTTY, i.e. "error not a TTY".

So that means... we can make that syscall ourselves!

Yeah! Who needs libc? Not us!

We will, however, need the unstable asm feature.

No problem! We can switch to the nightly channel:

TOML markup
# in `terminus/rust-toolchain.toml`
[toolchain]
channel = "nightly-2021-09-23"

If you're using rustup to manage Rust versions, all cargo commands in the terminus/ folder should now use that nightly version. Neat!

Cool bear's hot tip

Why pin to a specific nightly instead of just doing channel = "nightly" (which would install the latest nightly)?

Well, in nightly Rust anything can change. And in case you're reading this article from The Future, it's very possible that your nightly will behave differently.

So, for this article, we're using that specific nightly. Upgrade at your own risk.

Rust code
// in `terminus/src/main.rs`
#![feature(asm)]
use std::error::Error;
fn main() -> Result<(), Box<dyn Error>> {
    // on Linux x86_64, everything is a `u64`.
    const STDOUT: u64 = 1;
    // found in linux/source/include/uapi/asm-generic/ioctls.h
    const TCGETS: u64 = 0x5401;
    // let's go ahead and guess that whatever "TCGETS" is getting
    // doesn't take more than 32KiB.
    const GENEROUS_BUFFER_SIZE: usize = 32 * 1024;
    let mut mysterious_buffer = vec![0u8; GENEROUS_BUFFER_SIZE];
    // okay, here we gooo
    let ret = unsafe { ioctl(STDOUT, TCGETS, mysterious_buffer.as_mut_ptr()) };
    // phew, we made it.
    dbg!(ret);
    Ok(())
}
unsafe fn ioctl(fd: u64, cmd: u64, arg: *mut u8) -> i64 {
    let syscall_number: u64 = 16;
    let ret: u64;
    asm!(
        "syscall",
        inout("rax") syscall_number => ret,
        in("rdi") fd,
        in("rsi") cmd,
        in("rdx") arg,
        // those aren't used, but they may be clobbered by
        // the syscall, so we need to let LLVM know.
        lateout("rcx") _, lateout("r11") _,
        options(nostack)
    );
    // errors are negative, so this is actually an i64
    ret as i64
}

Let's give it a shot!

Shell session
$ cargo run --quiet
[src/main.rs:21] ret = 0
$ cargo run --quiet | cat
[src/main.rs:21] ret = -25

And error 25 is...

C code
// in `linux/source/include/uapi/asm-generic/errno-base.h`
#define	ENOTTY 25	/* Not a typewriter */
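That "negative means error" convention can be wrapped up in a tiny helper. This is a sketch with a made-up name (decode_syscall_ret); it relies on the Linux convention that a raw syscall reports failure by returning a value between -4095 and -1, which is the negated errno.

```rust
use std::io;

/// Turn the raw return value of a Linux syscall into a Result.
/// Values in -4095..=-1 are negated errno codes; anything else is success.
fn decode_syscall_ret(ret: i64) -> io::Result<u64> {
    if (-4095..0).contains(&ret) {
        Err(io::Error::from_raw_os_error(-ret as i32))
    } else {
        Ok(ret as u64)
    }
}

fn main() {
    // The two values we got from our hand-rolled ioctl.
    for ret in [0i64, -25] {
        match decode_syscall_ret(ret) {
            Ok(value) => println!("{} => ok ({})", ret, value),
            Err(err) => println!("{} => error: {}", ret, err),
        }
    }
}
```

The error message for 25 comes from the C library's own table, so the exact wording depends on the system.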

Hurray!

HECK YEAH NO LIBC

Well I mean... we still depend on, like...

Shell session
$ nm ./target/debug/terminus | grep "U " | grep GLIBC | wc -l
56

...at least fifty-six functions from libc.

Well okay sure but we made our own isatty! Go us!

And thus concludes our ride down into the lower level of abstractions.

Now we know!

What did we learn?

If you really, really don't want to use libc, you don't have to.

Linux kernel syscall numbers are stable (here's a nice table of them) and so are the constants, so you don't have to go through libc.

The situation is different on other mainstream OSes. For example, Go used to do "raw system calls" on macOS, but Go programs often broke with new kernel versions. Since Go 1.12, Go on macOS goes through libSystem (Apple's libc) instead.

Let's return to a place where we don't actually need to make our own syscalls, shall we? There's a couple others I'd like to try out.

So let's switch back to stable:

Shell session
$ rm rust-toolchain.toml

And then just use the libc crate!

Shell session
$ cargo add libc
    Updating 'https://github.com/rust-lang/crates.io-index' index
      Adding libc v0.2.102 to dependencies
Rust code
// in `terminus/src/main.rs`
use libc::{isatty, STDOUT_FILENO};
fn main() {
    let stdout_is_tty = unsafe { isatty(STDOUT_FILENO) };
    dbg!(stdout_is_tty);
}
Shell session
$ cargo run -q
[src/main.rs:5] stdout_is_tty = 1
$ cargo run -q | cat
[src/main.rs:5] stdout_is_tty = 0

There, street cred be damned.

Since we're on Linux, it's mountains of C all the way down anyway.

Now, where were we? Ah right! The man page for isatty also mentioned ttyname, which looked interesting, because, between you and me, I still don't have the faintest idea what a "TTY" actually is.

Fair enough, let's try it.

Rust code
// in `terminus/src/main.rs`
use libc::{isatty, ttyname, STDOUT_FILENO};
use std::{error::Error, ffi::CStr};
fn main() -> Result<(), Box<dyn Error>> {
    let stdout_is_tty = unsafe { isatty(STDOUT_FILENO) } == 1;
    if stdout_is_tty {
        let tty_name = unsafe { ttyname(STDOUT_FILENO) };
        assert!(!tty_name.is_null());
        let tty_name = unsafe { CStr::from_ptr(tty_name) };
        let tty_name = tty_name.to_str()?;
        println!("stdout is a TTY: {}", tty_name);
    } else {
        println!("stdout is not a TTY");
    }
    Ok(())
}
Shell session
$ cargo run -q
stdout is a TTY: /dev/pts/11
$ cargo run -q | cat
stdout is not a TTY

Ahah! A TTY is... a file?

What can we do with that file? Can we write to it?

Oh my. Yes. Yes we can.

(I didn't type in the upper pane — it was printed when I ran echo in the lower pane.)

Can we read from it?

Uhhhhhhh sorta kinda yes. It looks like it's a free-for-all: whoever reads first gets the.. worm. If I type really slowly "cat" wins the race.

Wait, you're doing all of this from inside tmux right?

That's how you can have several panes like that?

Yeah!

So does that mean... does each pane have its own TTY?

That seems like it makes sense, but let's check:

Right! They certainly each have their own terminals. How many terminals am I even on right now? Assuming they pop in and out of /dev/pts, we should be able to just list them like so:

Shell session
$ ls -lhA /dev/pts
total 0
crw--w---- 1 root tty  136,  0 Sep 23 11:57 0
crw--w---- 1 amos tty  136,  1 Sep 23 11:57 1
crw--w---- 1 amos tty  136, 10 Sep 24 18:23 10
crw--w---- 1 amos tty  136, 11 Sep 24 18:23 11
crw--w---- 1 amos tty  136, 12 Sep 24 18:15 12
crw--w---- 1 amos tty  136, 13 Sep 24 18:23 13
crw--w---- 1 amos tty  136,  2 Sep 24 18:23 2
crw--w---- 1 amos tty  136,  3 Sep 24 12:46 3
crw--w---- 1 amos tty  136,  4 Sep 24 18:22 4
crw--w---- 1 amos tty  136,  5 Sep 24 13:09 5
crw--w---- 1 amos tty  136,  6 Sep 24 12:50 6
crw--w---- 1 amos tty  136,  7 Sep 24 12:47 7
crw--w---- 1 amos tty  136,  8 Sep 24 13:00 8
crw--w---- 1 amos tty  136,  9 Sep 24 18:22 9
c--------- 1 root root   5,  2 Sep 23 11:57 ptmx

Ah! Fourteen (counting from zero). Some even date from yesterday. How do we know who they belong to?

Well... they're just files, right?

Right.. so if some process has them open... we should have corresponding file descriptors.

Oooh and the kernel keeps track of file descriptors!

Right! And lsof (list open files) can let us know which processes are holding which file descriptors.

Let's try it for the current TTY:

Shell session
$ cargo run -q
stdout is a TTY: /dev/pts/13
$ lsof /dev/pts/13
COMMAND     PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
zsh       10614 amos    0u   CHR 136,13      0t0   16 /dev/pts/13
zsh       10614 amos    1u   CHR 136,13      0t0   16 /dev/pts/13
zsh       10614 amos    2u   CHR 136,13      0t0   16 /dev/pts/13
zsh       10614 amos   10u   CHR 136,13      0t0   16 /dev/pts/13
zsh       10661 amos   15u   CHR 136,13      0t0   16 /dev/pts/13
zsh       10661 amos   16u   CHR 136,13      0t0   16 /dev/pts/13
zsh       10661 amos   17u   CHR 136,13      0t0   16 /dev/pts/13
zsh       10667 amos   15u   CHR 136,13      0t0   16 /dev/pts/13
zsh       10667 amos   16u   CHR 136,13      0t0   16 /dev/pts/13
zsh       10667 amos   17u   CHR 136,13      0t0   16 /dev/pts/13
zsh       10669 amos   15u   CHR 136,13      0t0   16 /dev/pts/13
zsh       10669 amos   16u   CHR 136,13      0t0   16 /dev/pts/13
zsh       10669 amos   17u   CHR 136,13      0t0   16 /dev/pts/13
gitstatus 10670 amos   15u   CHR 136,13      0t0   16 /dev/pts/13
gitstatus 10670 amos   16u   CHR 136,13      0t0   16 /dev/pts/13
gitstatus 10670 amos   17u   CHR 136,13      0t0   16 /dev/pts/13
lsof      14867 amos    0u   CHR 136,13      0t0   16 /dev/pts/13
lsof      14867 amos    1u   CHR 136,13      0t0   16 /dev/pts/13
lsof      14867 amos    2u   CHR 136,13      0t0   16 /dev/pts/13

Ha! That's a lot of processes. I guess they all just coordinate together to make that fancy prompt happen. (It's powerlevel10k, by the way).

Oh and lsof even found itself!

Mhh I wonder how lsof works... is it using a fancy kernel interface?

Let's find out!

Shell session
$ strace -o strace.log -- lsof /dev/pts/10 &> /dev/null
$ grep -E 'openat.*/proc' strace.log | head
openat(AT_FDCWD, "/proc/filesystems", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/proc/mounts", O_RDONLY) = 3
openat(AT_FDCWD, "/proc/16057/fdinfo/3", O_RDONLY) = 6
openat(AT_FDCWD, "/proc/locks", O_RDONLY) = 3
openat(AT_FDCWD, "/proc", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = 3
openat(AT_FDCWD, "/proc/1/stat", O_RDONLY) = 6
openat(AT_FDCWD, "/proc/1/maps", O_RDONLY) = -1 EACCES (Permission denied)
openat(AT_FDCWD, "/proc/1/fd", O_RDONLY|O_NONBLOCK|O_CLOEXEC|O_DIRECTORY) = -1 EACCES (Permission denied)
openat(AT_FDCWD, "/proc/45/stat", O_RDONLY) = 6
openat(AT_FDCWD, "/proc/45/maps", O_RDONLY) = -1 EACCES (Permission denied)

Oh. Uh oh.

Shell session
$ grep -E 'openat.*/proc' strace.log | wc -l
1267

Oh. Ooooooh okay. It's just going through procfs, looking for all open file descriptors of all processes.
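We can do a poor man's version of that ourselves. The sketch below assumes procfs is mounted at /proc and only looks at the /proc/PID/fd symlinks; the real lsof reads a lot more than that (the strace log shows stat, maps, fdinfo...). Both function names are invented for this example.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// List (fd number, target) pairs for one process via /proc/<pid>/fd.
fn open_files_of(pid: &str) -> io::Result<Vec<(u32, PathBuf)>> {
    let mut found = Vec::new();
    for entry in fs::read_dir(Path::new("/proc").join(pid).join("fd"))? {
        let entry = entry?;
        // Each entry is a symlink named after the fd, pointing at the file.
        let fd = match entry.file_name().to_str().and_then(|s| s.parse::<u32>().ok()) {
            Some(fd) => fd,
            None => continue,
        };
        if let Ok(target) = fs::read_link(entry.path()) {
            found.push((fd, target));
        }
    }
    Ok(found)
}

/// Scan every process we're allowed to look at for descriptors open on `wanted`.
fn holders_of(wanted: &Path) -> Vec<(String, u32)> {
    let mut holders = Vec::new();
    let procs = match fs::read_dir("/proc") {
        Ok(procs) => procs,
        Err(_) => return holders,
    };
    for entry in procs.flatten() {
        let name = entry.file_name();
        let pid = match name.to_str() {
            Some(s) if s.chars().all(|c| c.is_ascii_digit()) => s.to_owned(),
            _ => continue, // not a process directory
        };
        // Other users' processes fail with EACCES, just like in the strace log.
        if let Ok(files) = open_files_of(&pid) {
            for (fd, target) in files {
                if target.as_path() == wanted {
                    holders.push((pid.clone(), fd));
                }
            }
        }
    }
    holders
}

fn main() {
    let wanted = std::env::args().nth(1).unwrap_or_else(|| "/dev/null".to_string());
    for (pid, fd) in holders_of(Path::new(&wanted)) {
        println!("pid {} holds {} as fd {}", pid, wanted, fd);
    }
}
```

Pass it a path like the one ttyname printed to see who's holding your terminal.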

Well then.

What did we learn?

At least inside of tmux, our TTYs (terminals, née teletype) are PTYs (pseudo-terminals).

Pseudo-terminals live under /dev/pts/ and apparently, anyone can just go ahead and open them. I mean, there's Unix permissions and stuff but still, uh, feels a little unsettling.

Okay. OKAY. That's all well and good. I feel like we're making tangible progress here, but there's something that puzzles me.

Yeah! Something doesn't add up!

You think so too? What thing, exactly?

Well the... we've got one end of the.. when we open a pseudo-terminal, what we have is... and we can write into it... but then how-

So you don't know. You just thought you'd jump in.

I...

Aaaaaanyway.

Yes, bear. Yes, we're only holding half of it.

Because as you may have noticed, when we rudely echoed something into /dev/pts/N, it showed up in the other pane, but zsh didn't try to execute it.

Yes, that! Why didn't it try to execute it?

Well, let's think for a second.

When we type into a terminal, who gets the input?

Well... in this case it's zsh?

Wrong! It absolutely is not zsh. First, it's my Logitech K120 keyboard, which I keep trying to replace but is the only thing that has worked for me for the past decade, that's right mechanical keyboard nerds, your resident deep dive writer is using second-rate office supplies.

And then, well, something something USB-A, a hub, USB-C, presumably some Intel piece of hardware inside my MacBookPro, the driver for the USB host controller, eventually the Windows kernel, and then... the input chain? Who knows how many layers that's got — but eventually we get a "key event", whatever form it takes.

Then it goes through Windows Terminal which, honestly if you're still using cmd.exe well... what are you waiting for?

Of course that's because I have a weird WSL2 setup going on. I tried using Linux as my main OS, I really did.

But my point is: it's not zsh that gets the input. It's Windows Terminal. If I was on Linux, it'd be alacritty or xfce4-terminal or something.

Mhhh. Okay, fair, but surely it then just forwards that input to zsh, right?

Wrong! Because that would not be a TTY. If zsh's standard input was merely a pipe that Windows Terminal wrote to (let's forget there's two different OSes in action for a minute), isatty would definitely return false.
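That claim is easy to check from the other side. In this sketch we spawn a shell whose stdout is a pipe (that's what Command::output sets up) and let it run the POSIX test "-t 1", which is true only if file descriptor 1 is a terminal. It assumes /bin/sh exists; the function name is made up for this example.

```rust
use std::process::Command;

/// Ask a child process what its stdout looks like when we capture it.
fn child_view_of_stdout() -> std::io::Result<String> {
    let out = Command::new("/bin/sh")
        .arg("-c")
        // `[ -t 1 ]` succeeds only if fd 1 is a terminal.
        .arg("if [ -t 1 ]; then echo tty; else echo 'not a tty'; fi")
        // output() connects the child's stdout to a pipe we read from.
        .output()?;
    Ok(String::from_utf8_lossy(&out.stdout).trim().to_owned())
}

fn main() -> std::io::Result<()> {
    println!("child says: {}", child_view_of_stdout()?);
    Ok(())
}
```

No matter where the parent's own output goes, the child sees a pipe and answers accordingly.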

And besides... watch this:

And now this:

Yeah, the output of bat scales depending on the size of the terminal, so what?

Well, how do you do that with a pipe, huh bear? HOW INDEED?

Ohhhhhhhhh.. with... with...

THAT'S RIGHT. With an ioctl syscall. Which we did above.

But the ioctls we want to perform don't work unless the file descriptor is a TTY.
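For the curious, the "how big is this terminal" question is an ioctl called TIOCGWINSZ. Here's a sketch of asking it, assuming Linux with the asm-generic ioctl numbers (x86_64, aarch64) and glibc's ioctl signature; the Winsize struct is declared by hand and terminal_size is a name invented for this example.

```rust
use std::fs::File;
use std::os::raw::{c_int, c_ulong, c_ushort};
use std::os::unix::io::AsRawFd;

/// Hand-written mirror of the kernel's `struct winsize`.
#[repr(C)]
#[derive(Debug, Default)]
struct Winsize {
    ws_row: c_ushort,
    ws_col: c_ushort,
    ws_xpixel: c_ushort,
    ws_ypixel: c_ushort,
}

// From include/uapi/asm-generic/ioctls.h, right next to TCGETS.
const TIOCGWINSZ: c_ulong = 0x5413;

extern "C" {
    fn ioctl(fd: c_int, request: c_ulong, ...) -> c_int;
}

/// Returns (columns, rows) if `fd` is a terminal, None otherwise.
fn terminal_size(fd: c_int) -> Option<(u16, u16)> {
    let mut ws = Winsize::default();
    // The kernel fills in `ws` on success; on failure we ignore it.
    let ret = unsafe { ioctl(fd, TIOCGWINSZ, &mut ws as *mut Winsize) };
    if ret == 0 {
        Some((ws.ws_col, ws.ws_row))
    } else {
        None
    }
}

fn main() -> std::io::Result<()> {
    println!("stdout: {:?}", terminal_size(1));
    let dev_null = File::open("/dev/null")?;
    println!("/dev/null: {:?}", terminal_size(dev_null.as_raw_fd()));
    Ok(())
}
```

Run it in a terminal and you get a size; pipe it into cat and stdout joins /dev/null in returning None.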

So then...

Then we're back to square one, yes.

And you know what's worse? All that information, like the terminal's size and whatnot, it's out-of-band!

It is?

Yeah! It's not like colors which are like "ooh here's a funny byte and then ASCII numbers and punctuation I guess, 420 shipit" — these are kernel operations on file descriptors, they're not in the stream at all.

But then how do you know wh-

When a terminal is resized? Well HERE COMES THE FUN BIT! It's not via ioctls! Not at all! Because those are a "pull" interface, you just ask for the info when you need it — and you don't want to be constantly requesting it.

(It's presumably "costly", otherwise ls wouldn't memoize/cache it. At the very least it crosses the kernel boundary, because it's a syscall).

So you know how you know? That a terminal is resized?

No, that's what I was just ask-

WELL YOU GET A SIGNAL

THAT'S RIGHT. A SIGNAL. Just like when a process is killed or stopped.
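The signal in question is SIGWINCH ("window change"). A minimal sketch of listening for it follows, with a few assumptions: the value 28 is the Linux number on x86_64 and aarch64, signal() is declared by hand with a simplified return type, and raise() is used to fake a resize so we don't have to drag a window corner around. The handler only flips an atomic flag, since very little is allowed inside a signal handler.

```rust
use std::os::raw::c_int;
use std::sync::atomic::{AtomicBool, Ordering};

// Linux value on x86_64 and aarch64 (assumption: not a portable constant).
const SIGWINCH: c_int = 28;

static RESIZED: AtomicBool = AtomicBool::new(false);

// Runs in signal context: just record that something happened.
extern "C" fn on_winch(_sig: c_int) {
    RESIZED.store(true, Ordering::SeqCst);
}

extern "C" {
    // Simplified: the real return type is a function pointer, SIG_ERR is -1.
    fn signal(signum: c_int, handler: extern "C" fn(c_int)) -> usize;
    fn raise(sig: c_int) -> c_int;
}

/// Install the handler; returns false if signal() reported SIG_ERR.
fn install_handler() -> bool {
    unsafe { signal(SIGWINCH, on_winch) != usize::MAX }
}

/// Returns true once per received SIGWINCH (or burst of them).
fn take_resized() -> bool {
    RESIZED.swap(false, Ordering::SeqCst)
}

fn main() {
    assert!(install_handler());
    // Pretend the terminal was resized.
    unsafe { raise(SIGWINCH) };
    if take_resized() {
        println!("got SIGWINCH: time to ask for the window size again");
    }
}
```

So the "push" part only says that something changed; you still go back to the "pull" interface to find out what the new size is.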

So between in-band information (like ANSI escape codes) and out-of-band information (like signals and ioctls), GOOD LUCK WRITING A TERMINAL EMULATOR.

"EVERYTHING IS A FILE" MY LILY-WHITE BUTT.

Amos please, there's childr-

WELL THEY HAD TO FIND OUT ONE DAY.

Okay, okay, let's take a breather. We actually don't need to care about signals, thankfully. We'll just never ever resize our terminals. That's fine. It's fine. It's okay. We're okay.

I don't really think we need to care about ioctls either, if all we want to do is just pretend we're a terminal - we don't need to get the terminal info, our child process does... just like zsh requests the terminal info set by Windows Terminal.

But don't we need to... set it?

Bear, I don't even know how we make a pseudo-terminal.

No don't... don't cr- here, I'm looking it up right now. If we look for libc functions that have "pty" in them, surely we'll find something...

(quietly sobbing)

THERE! There's a thing called "openpty".

There... there is?

Yeahhhh! It looks great! It takes a pointer to a master file descriptor, and to a, oh.

What?

Nothing haha. I was saying: it takes a pointer to a primary file descriptor, and a secondary file descriptor, also a name, a "termp" and a "winp".

Please tell me most of these can be null.

(looks closer) yeah I mean we can try.

Okay. Sure. We'll try.

Let's look at that function closer... gonna be paraphrasing that man page to modernize it a little bit.

The function openpty() attempts to obtain the next available pseudo-terminal from the system (see pty(4)).

Okay...

If it successfully finds one, it subsequently changes the ownership of the secondary device to the real UID of the current process, the group membership to the group "tty" (if such a group exists in the system), the access permissions for reading and writing by the owner, and for writing by the group, and invalidates any current use of the line by calling revoke(2).

Okay, sure, permissions, why not.

If the argument name is not NULL, openpty() copies the pathname of the secondary pty to this area. The caller is responsible for allocating the required space in this array.

Oh HELL no, we're not letting libc write past the end of a buffer. Not today. That'll be null, thank you very much.

If the arguments termp or winp are not NULL, openpty() initializes the termios and window size settings from the structures these arguments point to, respectively.

Ah! Those can be null.

Upon return, the open file descriptors for the primary and secondary side of the pty are returned in the locations pointed to by aprimary and asecondary, respectively.

Very well, let's see what happens then.

Rust code
// in `terminus/src/main.rs`
use libc::{isatty, ttyname};
use std::{error::Error, ffi::CStr};
fn main() -> Result<(), Box<dyn Error>> {
    let mut primary_fd: i32 = 0;
    let mut secondary_fd: i32 = 0;
    println!("Opening pty...");
    unsafe {
        let ret = libc::openpty(
            &mut primary_fd,
            &mut secondary_fd,
            std::ptr::null_mut(),
            std::ptr::null(),
            std::ptr::null(),
        );
        if ret != 0 {
            panic!("Failed to openpty!");
        }
    };
    dbg!(primary_fd, secondary_fd);
    let is_tty = unsafe { isatty(secondary_fd) } == 1;
    if is_tty {
        let tty_name = unsafe { ttyname(secondary_fd) };
        assert!(!tty_name.is_null());
        let tty_name = unsafe { CStr::from_ptr(tty_name) };
        let tty_name = tty_name.to_str()?;
        println!("secondary is a TTY: {}", tty_name);
    } else {
        println!("secondary is not a TTY");
    }
    Ok(())
}
Shell session
$ cargo run --quiet
Opening pty...
[src/main.rs:20] primary_fd = 3
[src/main.rs:20] secondary_fd = 4
secondary is a TTY: /dev/pts/9

Yes.

YES.

WE HAVE A PSEUDO-TERMINAL. AHHHHH

And I guess we hold the primary and the secondary is what our child process gets?

I guess!?!

Ok but how-

Yes yes how do we get another process to use said pseudo-terminal. Well, you'd think you could just pass the file descriptor as stdin/stdout/stderr, right?

Wrong!

Because see, Linux processes have, like, "sessions", and those sessions have "leaders" and also they have a "controlling terminal" and there's definitely a call to "allocate a new session" but the way you change the "controlling terminal" is much much funnier.

Let's look at the code for login_tty, which does both:

C code
// in `glibc/login/login_tty.c`
int
__login_tty (int fd)
{
    __setsid();
#ifdef TIOCSCTTY
    if (__ioctl(fd, TIOCSCTTY, NULL) == -1)
        return (-1);
#else
    {
      /* This might work.  */
      char *fdname = ttyname (fd);
      int newfd;
      if (fdname)
        {
          if (fd != 0)
        _close (0);
          if (fd != 1)
        __close (1);
          if (fd != 2)
        __close (2);
          newfd = __open64 (fdname, O_RDWR);
          __close (newfd);
        }
    }
#endif
    while (__dup2(fd, 0) == -1 && errno == EBUSY)
      ;
    while (__dup2(fd, 1) == -1 && errno == EBUSY)
      ;
    while (__dup2(fd, 2) == -1 && errno == EBUSY)
      ;
    if (fd > 2)
        __close(fd);
    return (0);
}

I... I don't know where to start. On platforms that don't support the TIOCSCTTY ioctl, it closes standard input, output and error and opens the secondary side of the TTY.

Because that makes it the controlling terminal. Because of course.
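If you want to see which session and controlling terminal a process has right now, one way to peek (not something login_tty does, just a way to observe the result) is /proc/PID/stat. The sketch below assumes the field layout documented in proc(5): after the parenthesized command name come state, ppid, pgrp, session and tty_nr, and tty_nr packs the device's major and minor numbers. The struct and function names are made up.

```rust
use std::fs;

#[derive(Debug, PartialEq)]
struct TtyInfo {
    session: i32,
    tty_major: u32,
    tty_minor: u32,
}

/// Pull the session ID and controlling terminal out of a /proc/<pid>/stat line.
fn parse_stat(stat: &str) -> Option<TtyInfo> {
    // The command name is in parentheses and may contain spaces or ')',
    // so everything is located relative to the LAST ')'.
    let after_comm = &stat[stat.rfind(')')? + 1..];
    let mut fields = after_comm.split_whitespace();
    // Order after the name: state, ppid, pgrp, session, tty_nr, ...
    let session: i32 = fields.nth(3)?.parse().ok()?;
    let tty_nr = fields.next()?.parse::<i32>().ok()? as u32;
    Some(TtyInfo {
        session,
        // The kernel's device number encoding: major in bits 8..20,
        // minor split between bits 0..8 and bits 20..32.
        tty_major: (tty_nr >> 8) & 0xfff,
        tty_minor: (tty_nr & 0xff) | ((tty_nr >> 12) & 0xfff00),
    })
}

fn main() {
    let stat = fs::read_to_string("/proc/self/stat").expect("procfs should be mounted");
    match parse_stat(&stat) {
        Some(info) => println!("{:?}", info),
        None => println!("could not parse {:?}", stat),
    }
}
```

A major of 136 with minor N is the same "136, N" pair that ls shows for /dev/pts/N; a tty_nr of 0 means the process has no controlling terminal.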

And then uhh those while loops look horrifying to me but I guess they're extremely standard *nix stuff and I'm really just showing my lack of experience there.

Regardless: yuck.

But you know... as long as it works...

The real question is... when do we do that? When do we call login_tty?

Cool bear's hot tip

If we didn't insist on using Rust's Command abstraction, we could just use forkpty, which does fork and login_tty.

But we do insist. And it's more fun to peek into what's "actually" going on, for some value of "actually".

When we executed a program from Rust, we did something like this:

Rust code
        Command::new("/bin/ls")
            .arg("--color=auto")
            .output()
            .map(|s| String::from_utf8(s.stdout))??

But that nice, high-level API is hiding the terrible truth.

A great number of cursed things happen when spawning a process on Linux.

Well, unless you use posix_spawn. That's the good one.

But the traditional way, which we'll have no choice but to use here, has two important steps.

First, we fork. This creates an "exact" copy of the calling process, except for like, twenty different things (like having a different PID (process ID), a different PPID (parent process ID), etc.)

What's fun about fork is that we're splitting the space-time continuum — technically, it returns twice. Once in the parent process, and once in the child process.

In the child process, it returns 0, and that's how you know it's the child process because, well, since it's an "exact" copy, they're still executing the same code at this point. So you compare against 0, and things diverge from there.
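Here's that divergence as a bare-bones sketch, calling fork, waitpid and _exit straight from libc with hand-written declarations (assuming Linux, where pid_t is a C int). The child does nothing but exit with a known code, so the parent can prove it really was a separate process. fork_and_wait is a name invented for this example.

```rust
use std::os::raw::c_int;

extern "C" {
    fn fork() -> c_int;
    fn waitpid(pid: c_int, status: *mut c_int, options: c_int) -> c_int;
    fn _exit(code: c_int) -> !;
}

/// Fork, have the child exit immediately with `child_exit_code`,
/// and return the code the parent observed (None if anything went wrong).
fn fork_and_wait(child_exit_code: c_int) -> Option<c_int> {
    let pid = unsafe { fork() };
    match pid {
        -1 => None, // fork failed, there is no child
        // We are the child: leave right away, without running destructors
        // or flushing buffers that belong to the parent's copy of the world.
        0 => unsafe { _exit(child_exit_code) },
        // We are the parent: `child` is the new process's PID.
        child => {
            let mut status: c_int = 0;
            if unsafe { waitpid(child, &mut status, 0) } != child {
                return None;
            }
            // Low 7 bits zero means "exited normally"; the code is in bits 8..16.
            if status & 0x7f == 0 {
                Some((status >> 8) & 0xff)
            } else {
                None
            }
        }
    }
}

fn main() {
    println!("child exited with {:?}", fork_and_wait(7));
}
```

Same code, two return values: the 0 branch only ever runs in the child, the last branch only in the parent.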

And usually, at that point, you want to move on to exec, which asks the kernel to violently replace the current process (the child process we just fashioned from the ribs of Adam, I mean, the memory space of the calling process) with whichever ELF file we said to use.

Which is the way you normally run programs. Unless you really have too much time on your hands.

So let's think! We have login_tty, which makes the calling process the leader of a new session, and sets its controlling terminal to whichever file descriptor we passed.

Well. If we call it before fork it's going to mess with our parent process, which is almost definitely not what we want. We don't want to become the leader of our session. We don't want to change our own controlling terminal.

We only want, say, ls to have the pseudo-terminal we just created (with openpty) as its controlling terminal.

If we call it after exec, well... well we can't call anything after exec.

Any code from the parent process, that was "copied into" (really, just mapped) the child process, stops existing the second we call exec.

exec never returns, much like honey badger don't care.

So really, that only leaves us one choice.. we must execute login_tty between fork and exec.

Which means we can't use posix_spawn.

So we'll just look into our Rust code where we can do that and...

Rust code
        Command::new("/bin/ls")
            .arg("--color=auto")
            .output()
            .map(|s| String::from_utf8(s.stdout))??

Mhhh.

Mhh?

I don't see fork. Or exec. Or posix_spawn.

Ahhh yes, they're well hidden, in Terrible Truth Land, also known as the Rust standard library.

See, there's a posix_spawn fn there, and...

Rust code
// in `library/std/src/sys/unix/process_unix.rs`
    fn posix_spawn(
        &mut self,
        stdio: &ChildPipes,
        envp: Option<&CStringArray>,
    ) -> io::Result<Option<Process>> {
        use crate::mem::MaybeUninit;
        use crate::sys::{self, cvt_nz};
        if self.get_gid().is_some()
            || self.get_uid().is_some()
            || (self.env_saw_path() && !self.program_is_path())
            || !self.get_closures().is_empty()
            || self.get_groups().is_some()
            || self.get_create_pidfd()
        {
            return Ok(None);
        }
        // ...
    }

...and the first thing it does is check if we're doing something funky! And if we are, it doesn't actually use posix_spawn.

Funky things like, I don't know, setting a different UID (user ID) or GID (group ID), or... hey, what's that about closures?

(furiously going through std) So if that's set here then.. THERE! I got it!

What? Command::pre_exec? Good job bear! Now we can fina-

Wait, it's unsafe. Why is it unsafe.

It's uns- ah. Yes. Well, let's review the docs.

Notes and Safety

This closure will be run in the context of the child process after a fork. This primarily means that any modifications made to memory on behalf of this closure will not be visible to the parent process. This is often a very constrained environment where normal operations like malloc, accessing environment variables through std::env or acquiring a mutex are not guaranteed to work (due to other threads perhaps still running when the fork was run).

Ooooh boy. Okay so we won't allocate anyth-

This also means that all resources such as file descriptors and memory-mapped regions got duplicated. It is your responsibility to make sure that the closure does not violate library invariants by making invalid use of these duplicates.

Ah uh

Panicking in the closure is safe only if all the format arguments for the panic message can be safely formatted; this is because although Command calls std::panic::always_abort before calling the pre_exec hook, panic will still try to format the panic message.

Well I-

When this closure is run, aspects such as the stdio file descriptors and working directory have successfully been changed, so output to these locations may not appear where intended.

Okay, okay I got it — stuff gets real weird in there.
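Two of those warnings can be poked at with nothing but std. The sketch below is an illustration under assumptions, not how the article's program does things: the static, the function names and the errno value 5 are all made up, and it assumes /bin/sh exists. The first function shows that a write made inside the hook stays in the child. The second shows that returning an Err from the hook (instead of panicking) makes the spawn fail in the parent with that error code.

```rust
use std::io;
use std::os::unix::process::CommandExt;
use std::process::Command;
use std::sync::atomic::{AtomicU32, Ordering};

// Hypothetical flag, written only inside the pre_exec hook (so, in the child).
static TOUCHED_BY_HOOK: AtomicU32 = AtomicU32::new(0);

/// Runs `/bin/sh -c "exit 0"` with a hook that sets the flag in the child,
/// then returns the flag as the parent sees it afterwards.
fn flag_seen_by_parent() -> io::Result<u32> {
    let mut cmd = Command::new("/bin/sh");
    cmd.arg("-c").arg("exit 0");
    unsafe {
        cmd.pre_exec(|| {
            // An atomic store neither allocates nor takes a lock.
            TOUCHED_BY_HOOK.store(1, Ordering::SeqCst);
            Ok(())
        });
    }
    let status = cmd.status()?;
    assert!(status.success());
    Ok(TOUCHED_BY_HOOK.load(Ordering::SeqCst))
}

/// Spawns with a hook that fails, and returns the OS error code that the
/// parent gets back from `spawn`.
fn errno_from_failing_hook(code: i32) -> Option<i32> {
    let mut cmd = Command::new("/bin/sh");
    cmd.arg("-c").arg("exit 0");
    unsafe {
        // from_raw_os_error does not allocate, unlike io::Error::new.
        cmd.pre_exec(move || Err(io::Error::from_raw_os_error(code)));
    }
    cmd.spawn().err().and_then(|e| e.raw_os_error())
}

fn main() -> io::Result<()> {
    println!("flag in parent: {}", flag_seen_by_parent()?);
    println!("errno from hook: {:?}", errno_from_failing_hook(5));
    Ok(())
}
```

With a real call in the hook, io::Error::last_os_error() would be the natural thing to return, since it just wraps errno.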

We'll just login_tty and get out of here as fast as we can.

Leeeeeeeeeeeet's go:

Rust code
// in `terminus/src/main.rs`
use std::{error::Error, os::unix::prelude::CommandExt, process::Command};
fn openpty() -> (i32, i32) {
    let mut primary_fd: i32 = -1;
    let mut secondary_fd: i32 = -1;
    unsafe {
        let ret = libc::openpty(
            &mut primary_fd,
            &mut secondary_fd,
            std::ptr::null_mut(),
            std::ptr::null(),
            std::ptr::null(),
        );
        if ret != 0 {
            panic!("Failed to openpty!");
        }
    };
    (primary_fd, secondary_fd)
}
fn main() -> Result<(), Box<dyn Error>> {
    let (primary_fd, secondary_fd) = openpty();
    dbg!(primary_fd, secondary_fd);
    let mut cmd = Command::new("/bin/ls");
    cmd.arg("--color=auto");
    unsafe {
        cmd.pre_exec(move || {
            if libc::login_tty(secondary_fd) != 0 {
                panic!("couldn't set the controlling terminal or something");
            }
            Ok(())
        })
    };
    let output = cmd.output().map(|out| String::from_utf8(out.stdout))??;
    println!("{}", output);
    Ok(())
}
Shell session
$ cargo run --quiet
Opening pty...
[src/main.rs:19] primary_fd = 3
[src/main.rs:19] secondary_fd = 4

Oh that's.

That's perhaps a little too quiet.

What the heck happened here...

Wait wait wait, we're calling output().

So?

How do you think output() works?

Well... it probably has to redirect stdout and stderr to pip-

To pipes, yes precisely. And then it runs the pre-exec closures, one of which calls login_tty, which..

Oh yeahhh! That would definitely override whatever output() is doing.

But then.. then how do we get the output from /bin/ls?

If it's holding the secondary... and we're holding the primary... we need to.. read from it? Potentially maybe?

Okay fine let's try:

Rust code
use std::{
    error::Error,
    fs::File,
    io::Read,
    os::unix::prelude::{CommandExt, FromRawFd},
    process::Command,
};
// omitted: fn openpty()
fn main() -> Result<(), Box<dyn Error>> {
    let (primary_fd, secondary_fd) = openpty();
    dbg!(primary_fd, secondary_fd);
    let mut cmd = Command::new("/bin/ls");
    cmd.arg("--color=auto");
    unsafe {
        cmd.pre_exec(move || {
            if libc::login_tty(secondary_fd) != 0 {
                panic!("couldn't set the controlling terminal or something");
            }
            Ok(())
        })
    };
    let mut child = cmd.spawn()?;
    println!("Opening primary...");
    let mut primary = unsafe { File::from_raw_fd(primary_fd) };
    println!("Reading from primary...");
    let mut buffer = String::new();
    primary.read_to_string(&mut buffer)?;
    println!("All done!");
    println!("{}", buffer);
    child.wait()?;
    Ok(())
}
Shell session
$ cargo run --quiet
[src/main.rs:29] primary_fd = 3
[src/main.rs:29] secondary_fd = 4
Opening primary...
Reading from primary...
^C

Okay that just gets stuck forever. Mh.

Uhhh maybe the terminal... remains open?? What does strace say?

Shell session
$ cargo build --quiet
$ strace ./target/debug/terminus
(cut)
write(1, "Opening primary...\n", 19Opening primary...
)    = 19
write(1, "Reading from primary...\n", 24Reading from primary...
) = 24
read(3, "Cargo.lock  Cargo.toml  \33[0m\33[01", 32) = 32
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=31500, si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
read(3, ";34msrc\33[0m  strace.log  \33[01;34", 32) = 32
read(3, "mtarget\33[0m\r\n", 64)        = 13
read(3, ^C0x56265b70c23d, 51)             = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
strace: Process 31499 detached

Ohhh. OHHH. Yeah! It does! And read_to_string reads until EOF, which never happens.
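The same "EOF only shows up once every writer is gone" rule is easier to see with plain pipes. This is an analogy rather than the pty setup above: a small sketch, assuming /bin/cat exists, with a made-up function name. Take out the drop(stdin) line and read_to_string hangs, just like our program did.

```rust
use std::io::{self, Read, Write};
use std::process::{Command, Stdio};

/// Sends `input` through `/bin/cat` and reads everything back.
fn round_trip_through_cat(input: &str) -> io::Result<String> {
    let mut child = Command::new("/bin/cat")
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()?;

    let mut stdin = child.stdin.take().expect("stdin was piped");
    stdin.write_all(input.as_bytes())?;
    // Closing our write end is what lets cat see EOF and exit. While we
    // keep it open, cat keeps waiting, and so would our read below.
    drop(stdin);

    let mut output = String::new();
    child
        .stdout
        .take()
        .expect("stdout was piped")
        .read_to_string(&mut output)?; // returns once cat closes its stdout
    child.wait()?;
    Ok(output)
}

fn main() -> io::Result<()> {
    print!("{}", round_trip_through_cat("hello from the parent\n")?);
    Ok(())
}
```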

Cool bear's hot tip

As noted by many, openpty returns an open file descriptor to the primary and the secondary. We passed a copy of the secondary to the child process but didn't close the parent process's copy of it.

If we did close the parent process's copy, then we would see EOF from the primary as soon as the child process exits, and everything past this point in this article would be unnecessary.

However, amos has a use case where he wants to re-use the terminal for spawning several processes in sequence, yet needs to tell apart the output from each individual process, so it makes sense for him. In most other cases though, you can just close the parent's secondary fd and be done with it!

So we need to...

..read from the primary only until the child process exits?

Yeah, that!

Rust code
fn main() -> Result<(), Box<dyn Error>> {
    let (primary_fd, secondary_fd) = openpty();
    dbg!(primary_fd, secondary_fd);
    let mut cmd = Command::new("/bin/ls");
    cmd.arg("--color=auto");
    unsafe {
        cmd.pre_exec(move || {
            if libc::login_tty(secondary_fd) != 0 {
                panic!("couldn't set the controlling terminal or something");
            }
            Ok(())
        })
    };
    let mut child = cmd.spawn()?;
    let mut primary = unsafe { File::from_raw_fd(primary_fd) };
    let mut out = vec![];
    let mut buffer = vec![0u8; 1024];
    loop {
        let n = primary.read(&mut buffer)?;
        out.extend_from_slice(&buffer[..n]);
        println!("Read {} bytes...", n);
        if child.try_wait()?.is_some() {
            break;
        }
    }
    println!("Child exited!");
    println!("{}", String::from_utf8(out)?);
    Ok(())
}

And, lo and behold:

Shell session
$ cargo run --quiet
[src/main.rs:29] primary_fd = 3
[src/main.rs:29] secondary_fd = 4
Read 77 bytes...
Child exited!
Cargo.lock  Cargo.toml  src  strace.log  target

We have colors.

Uhh ahem.

Oh! Of course.

And lo and behold, we have colors:

We also have: a race condition.

If the child exits between our call to try_wait and the next read, we'll be stuck forever trying to read.

How could we possibly resolve this?

Okay so, it won't be pretty, but it won't race.

Those both sound like negatives.

We'll even run something fancier than ls, like, cargo check three times in a row, sleeping 1 second between each.

Rust code
// in `terminus/src/main.rs`
fn main() -> Result<(), Box<dyn Error>> {
    let (primary_fd, secondary_fd) = openpty();
    dbg!(primary_fd, secondary_fd);
    let mut cmd = Command::new("/bin/bash");
    cmd.arg("-c")
        .arg("for i in $(seq 1 3); do cargo check; sleep 1; done");
    unsafe {
        cmd.pre_exec(move || {
            if libc::login_tty(secondary_fd) != 0 {
                panic!("couldn't set the controlling terminal or something");
            }
            Ok(())
        })
    };
    let mut child = cmd.spawn()?;
    enum Msg {
        Output(Vec<u8>),
        Exit,
    }
    let (tx, rx) = std::sync::mpsc::channel();
    let read_tx = tx.clone();
    std::thread::spawn(move || {
        let mut primary = unsafe { File::from_raw_fd(primary_fd) };
        let mut buffer = vec![0u8; 1024];
        loop {
            let n = primary.read(&mut buffer).unwrap();
            println!("Read {} bytes...", n);
            let slice = &buffer[..n];
            read_tx.send(Msg::Output(slice.to_vec())).unwrap();
        }
    });
    std::thread::spawn(move || {
        child.wait().unwrap();
        tx.send(Msg::Exit).unwrap();
    });
    let mut out = vec![];
    loop {
        let msg = rx.recv()?;
        match msg {
            Msg::Output(buffer) => out.extend_from_slice(&buffer[..]),
            Msg::Exit => break,
        }
    }
    println!("Child exited!");
    println!("{}", String::from_utf8(out)?);
    Ok(())
}

Mhhhh. If only there was some Rust feature... that lets you deal with problems like these...

What problems?

You know. Concurrency.

Oh dear. You really think we should?

Only one way to find out...

What time is it? It's tokio time!

Channels be damned, we're bringing a whole darn executor with us. Because what is an executor? A miserable pile of secrets, er, interests and tasks.

And we're interested in knowing when the child exits, and also when we can read from our pseudo-terminal primary.

Do you know what you're doing?

Almost never.

So, uhhh let's go:

Shell session
$ cargo add tokio@1.12.0 --features full
    Updating 'https://github.com/rust-lang/crates.io-index' index
      Adding tokio v1.12.0 to dependencies with features: ["full"]
Rust code
// in `terminus/src/main.rs`
use std::{error::Error, os::unix::prelude::FromRawFd};
use tokio::{fs::File, io::AsyncReadExt, process::Command};
fn openpty() -> (i32, i32) {
    let mut primary_fd: i32 = -1;
    let mut secondary_fd: i32 = -1;
    unsafe {
        let ret = libc::openpty(
            &mut primary_fd,
            &mut secondary_fd,
            std::ptr::null_mut(),
            std::ptr::null(),
            std::ptr::null(),
        );
        if ret != 0 {
            panic!("Failed to openpty!");
        }
    };
    (primary_fd, secondary_fd)
}
#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    let (primary_fd, secondary_fd) = openpty();
    dbg!(primary_fd, secondary_fd);
    let mut cmd = Command::new("/bin/bash");
    cmd.arg("-c")
        .arg("for i in $(seq 1 3); do cargo check; sleep 0.2; done");
    unsafe {
        cmd.pre_exec(move || {
            if libc::login_tty(secondary_fd) != 0 {
                panic!("couldn't set the controlling terminal or something");
            }
            Ok(())
        })
    };
    let mut child = cmd.spawn()?;
    let mut out = vec![];
    let mut buf = vec![0u8; 1024];
    let mut primary = unsafe { File::from_raw_fd(primary_fd) };
    'weee: loop {
        tokio::select! {
            n = primary.read(&mut buf) => {
                let n = n?;
                println!("Read {} bytes", n);
                out.extend_from_slice(&buf[..n]);
            },
            status = child.wait() => {
                status?;
                println!("Child exited!");
                break 'weee
            },
        }
    }
    println!("{}", String::from_utf8(out)?);
    println!("Ok we're gonna return now");
    Ok(())
}

Works like a charm. No channels or explicit threads involved. (tokio's worker threads don't count; this would probably work on the single-threaded runtime as well.)

Nice, nice... but what's this ^C at the end?

Nothing.

What?

IT'S NOTHING. It's an exercise left to the reader.

But doesn't that mean we-

So the program hangs at the end of main. Big deal. We could just call std::process::exit! That would definitely cut things short.

Speaking of cutting things short, that's all I have for you today. As always, if you've liked this video and would like to see more like i- I mean, uhh, I hope you enjoyed the article, and until next time — take care!

At this point in time, my audience is large enough and varied enough that every time I write about something, folks will jump in with an extra dose of cursed knowledge. And this time was no exception!

First, Jakub Kądziołka actually did the exercise left to the reader and figured out why the async/tokio version was hanging at the end of "main". It is a very interesting read, and it showcases the new and still-experimental tokio console.

Go read it: Terminating the terminal case of Linux

Second, a whole bunch of folks mentioned that there are ways other than ioctls to, for example, tell terminal size. Apparently you can move the cursor to the bottom right corner, write \e[6n to stdout and it will reply with the cursor position. For xterm-compatible terminals, you can send \e[18t and they'll reply with \e[8;{rows};{cols}t, and you don't even have to move the cursor!

So there are ways to do these in-band, which comes in real handy when all you have is the stream itself, for reverse shells over the network or virtual character devices for virtual machines.
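Reading those replies back is mostly string parsing. Here is a rough sketch of the parsing half only, with made-up function names, assuming the reply has already been read from the terminal as a complete string: the size report in the \e[8;{rows};{cols}t shape quoted above, and the cursor position report, which comes back as \e[{row};{col}R. Actually sending the query and putting the terminal in raw mode to read the answer is left out.

```rust
/// Parses a window-size report of the form ESC [ 8 ; rows ; cols t.
/// Returns (rows, cols), or None if the reply has another shape.
fn parse_size_report(reply: &str) -> Option<(u16, u16)> {
    let body = reply.strip_prefix("\x1b[8;")?.strip_suffix('t')?;
    let (rows, cols) = body.split_once(';')?;
    Some((rows.parse().ok()?, cols.parse().ok()?))
}

/// Parses a cursor position report of the form ESC [ row ; col R.
/// Returns (row, col), or None if the reply has another shape.
fn parse_cursor_report(reply: &str) -> Option<(u16, u16)> {
    let body = reply.strip_prefix("\x1b[")?.strip_suffix('R')?;
    let (row, col) = body.split_once(';')?;
    Some((row.parse().ok()?, col.parse().ok()?))
}

fn main() {
    // Canned replies standing in for what a terminal would send back.
    println!("{:?}", parse_size_report("\x1b[8;24;80t"));
    println!("{:?}", parse_cursor_report("\x1b[24;80R"));
}
```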

Also, vim uses that trick to determine whether a terminal is in East Asian font mode.

Re: zsh not executing the commands we echo into its pseudo-terminal, apparently there's an ioctl named TIOCSTI that lets you simulate terminal input, and thus, inject commands. Fun!

Thanks to y'all for sharing additional cursedness — old systems are always entertaining to look at.
