Me, Myself, and Taco: [Process Management] Process/Thread Termination Part 2

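Terminology:

Source code is set in a typewriter (Courier New) font.
A task/process is the same thing as a thread in Linux.
In this article, "Linux" and "Linux kernel" both refer to the operating system kernel.
Italics mark Linux data structures, macros, variables, and function names.
PID is the same as TID.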

Kernel version:

v2.6.24.3

What does do_exit do?
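Following ordinary reasoning, once a thread finishes executing, the operating system has to mark it as leaving the execution environment and release the resources it holds, such as files, locks, and the memory occupied by the thread and its descriptor. Let's look at what the do_exit function actually does.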

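When a thread finishes, do_exit first sets PF_EXITING in the flags field of the thread's task_struct. After some housekeeping it calls exit_mm, which in turn calls mmput to drop the thread's reference to its address space. As the code below shows, the address space and its descriptor (mm_struct) are only torn down when mm_users drops to zero, that is, when the exiting thread was the last user.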


void mmput(struct mm_struct *mm)
{
        might_sleep();

        if (atomic_dec_and_test(&mm->mm_users)) {
                exit_aio(mm);
                exit_mmap(mm);
                if (!list_empty(&mm->mmlist)) {
                        spin_lock(&mmlist_lock);
                        list_del(&mm->mmlist);
                        spin_unlock(&mmlist_lock);
                }
                put_swap_token(mm);
                mmdrop(mm);
        }
}
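
The key to mmput is the "decrement, and clean up only if the count reached zero" pattern. Here is a minimal user-space sketch of that idea using C11 atomics. The names fake_mm, dec_and_test, and fake_mmput are made up for illustration and are not kernel APIs; the real function also handles AIO, the mmlist, and the swap token.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for mm_struct: only the user count matters here. */
struct fake_mm {
    atomic_int users;   /* plays the role of mm_users */
    bool torn_down;     /* set once the "address space" has been released */
};

/* Decrement and report whether the new value is zero,
 * in the spirit of atomic_dec_and_test. */
static bool dec_and_test(atomic_int *v)
{
    /* atomic_fetch_sub returns the value held before the subtraction. */
    return atomic_fetch_sub(v, 1) == 1;
}

/* Drop one reference; only the last caller performs the teardown. */
static void fake_mmput(struct fake_mm *mm)
{
    if (dec_and_test(&mm->users))
        mm->torn_down = true;
}

/* Two threads share one address space. Returns 1 if the teardown was
 * skipped on the first put and performed on the second, 0 otherwise. */
static int demo_shared_mm(void)
{
    struct fake_mm mm = { .torn_down = false };
    atomic_init(&mm.users, 2);

    fake_mmput(&mm);
    bool after_first = mm.torn_down;

    fake_mmput(&mm);
    bool after_second = mm.torn_down;

    return !after_first && after_second;
}
```

With two users, the first fake_mmput leaves the structure alone and only the second one tears it down, which is why a thread exiting from a multi-threaded process does not take the shared address space with it.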

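Next come the thread's files. __exit_files calls put_files_struct, which drops the thread's reference to its files_struct; when the last reference goes away, the open files are closed and the descriptor table is freed. __exit_fs then does the same for fs_struct through __put_fs_struct. Note that fs_struct holds the thread's filesystem context (root directory, current working directory, and umask) rather than a cache for the files that were just closed.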


static void __exit_files(struct task_struct *tsk)
{
        struct files_struct * files = tsk->files;

        if (files) {
                task_lock(tsk);
                tsk->files = NULL;
                task_unlock(tsk);
                put_files_struct(files);
        }
}


static void __exit_fs(struct task_struct *tsk)
{
        struct fs_struct * fs = tsk->fs;

        if (fs) {
                task_lock(tsk);
                tsk->fs = NULL;
                task_unlock(tsk);
                __put_fs_struct(fs);
        }
}

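Once these resources are released, exit_thread is called. If the thread had allocated an I/O port bitmap, the function frees it and resets the bitmap fields in the current CPU's TSS¹; the TSS itself stays in place.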


void exit_thread(void)
{
        /* The process may have allocated an io port bitmap... nuke it. */
        if (unlikely(test_thread_flag(TIF_IO_BITMAP))) {
                struct task_struct *tsk = current;
                struct thread_struct *t = &tsk->thread;
                int cpu = get_cpu();
                struct tss_struct *tss = &per_cpu(init_tss, cpu);

                kfree(t->io_bitmap_ptr);
                t->io_bitmap_ptr = NULL;
                clear_thread_flag(TIF_IO_BITMAP);
                /*
                 * Careful, clear this in the TSS too:
                 */
                memset(tss->io_bitmap, 0xff, tss->io_bitmap_max);
                t->io_bitmap_max = 0;
                tss->io_bitmap_owner = NULL;
                tss->io_bitmap_max = 0;
                tss->x86_tss.io_bitmap_base = INVALID_IO_BITMAP_OFFSET;
                put_cpu();
        }
}
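
The memset with 0xff may look odd at first. In the x86 I/O permission bitmap a set bit means access to that port is denied, so filling the bitmap with 0xff revokes every port. The sketch below shows the convention on a tiny bitmap. FAKE_IO_PORTS and the bitmap_* helpers are hypothetical names for illustration, and the size is far smaller than the real 65536-port space.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define FAKE_IO_PORTS 64                   /* illustrative size only */
#define FAKE_IO_BYTES (FAKE_IO_PORTS / 8)

/* Clearing a bit grants access to that port. */
static void bitmap_allow(unsigned char *bm, unsigned port)
{
    bm[port / 8] &= (unsigned char)~(1u << (port % 8));
}

/* A port is accessible only when its bit is clear. */
static bool bitmap_allows(const unsigned char *bm, unsigned port)
{
    return (bm[port / 8] & (1u << (port % 8))) == 0;
}

/* Setting every bit denies every port, like the memset in exit_thread. */
static void bitmap_revoke_all(unsigned char *bm)
{
    memset(bm, 0xff, FAKE_IO_BYTES);
}

/* Returns 1 if port 10 is usable after being granted and unusable
 * again after everything has been revoked, 0 otherwise. */
static int demo_io_bitmap(void)
{
    unsigned char bm[FAKE_IO_BYTES];

    bitmap_revoke_all(bm);
    bitmap_allow(bm, 10);
    bool granted = bitmap_allows(bm, 10);

    bitmap_revoke_all(bm);
    bool revoked = !bitmap_allows(bm, 10);

    return granted && revoked;
}
```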

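Next, exit_notify tells the other threads in the system that a thread has finished and is leaving. Its main job is to notify the exiting thread's parent and children, and to hand the children over to another thread that becomes their new parent. exit_notify calls forget_original_parent, which reparents the children to another thread in the same thread group that is not itself exiting. If no such thread exists, for example because the exiting thread is the only one in its group, the children go to the child reaper returned by task_child_reaper, normally the init process started from kernel_init.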


do {
        reaper = next_thread(reaper);
        if (reaper == father) {
                reaper = task_child_reaper(father);
                break;
        }
} while (reaper->flags & PF_EXITING);
...
list_for_each_entry_safe(p, n, &father->ptrace_children, ptrace_list) {
        p->real_parent = reaper;
        reparent_thread(p, father, 1);
}
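
The do/while loop above is compact, so here is a user-space sketch of the same search that can be read in isolation. The fake_task structure and pick_reaper function are assumptions made for illustration: next stands in for next_thread(), exiting for the PF_EXITING flag, and fallback for whatever task_child_reaper would return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced task descriptor. */
struct fake_task {
    const char *name;
    bool exiting;               /* stands in for flags & PF_EXITING */
    struct fake_task *next;     /* next thread in the circular thread group */
};

/* Walk the thread group starting after father. Return the first thread
 * that is not exiting, or fallback once the walk wraps around to father. */
static struct fake_task *pick_reaper(struct fake_task *father,
                                     struct fake_task *fallback)
{
    struct fake_task *reaper = father;

    do {
        reaper = reaper->next;
        if (reaper == father)
            return fallback;
    } while (reaper->exiting);

    return reaper;
}

/* Returns 1 if a live sibling is chosen when one exists and the
 * fallback is chosen for a single-thread group, 0 otherwise. */
static int demo_pick_reaper(void)
{
    struct fake_task init_task = { "init", false, NULL };

    /* Group a -> b -> c -> a: a is exiting, b is exiting, c is alive. */
    struct fake_task a = { "a", true, NULL };
    struct fake_task b = { "b", true, NULL };
    struct fake_task c = { "c", false, NULL };
    a.next = &b;
    b.next = &c;
    c.next = &a;
    bool sibling_ok = pick_reaper(&a, &init_task) == &c;

    /* A group with only the exiting thread points back at itself. */
    struct fake_task lone = { "lone", true, NULL };
    lone.next = &lone;
    bool fallback_ok = pick_reaper(&lone, &init_task) == &init_task;

    return sibling_ok && fallback_ok;
}
```

Because the thread group is a circular list, reaching father again means every other thread was skipped, and that is the moment the kernel loop falls back to the child reaper.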

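With the children handled, it is the parent's turn. A thread may have asked for an exit signal other than SIGCHLD; if it or its parent has since called exec (the exec IDs no longer match) and it lacks CAP_KILL, exit_signal is reset to SIGCHLD, the signal that tells the parent the thread has finished.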


if (tsk->exit_signal != SIGCHLD && tsk->exit_signal != -1 &&
    ( tsk->parent_exec_id != t->self_exec_id  ||
      tsk->self_exec_id != tsk->parent_exec_id)
    && !capable(CAP_KILL))
        tsk->exit_signal = SIGCHLD;
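
The condition is easier to follow as a pure function. The sketch below assumes that t in the excerpt is the exiting thread's parent, which the excerpt itself does not show. The exit_check structure and effective_exit_signal function are hypothetical, and has_cap_kill simply stands in for capable(CAP_KILL).

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for SIGCHLD and SIGUSR1 */
#endif
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* Hypothetical flattened view of the fields the check reads. */
struct exit_check {
    int exit_signal;               /* signal requested for the parent, -1 for none */
    unsigned parent_exec_id;       /* parent's exec id as recorded in the child */
    unsigned self_exec_id;         /* the child's own exec id */
    unsigned parent_self_exec_id;  /* parent's current exec id (assumed t->self_exec_id) */
    bool has_cap_kill;             /* stands in for capable(CAP_KILL) */
};

/* Return the signal that would be delivered after the check has run. */
static int effective_exit_signal(const struct exit_check *c)
{
    bool custom = c->exit_signal != SIGCHLD && c->exit_signal != -1;
    bool exec_changed = c->parent_exec_id != c->parent_self_exec_id ||
                        c->self_exec_id != c->parent_exec_id;

    if (custom && exec_changed && !c->has_cap_kill)
        return SIGCHLD;
    return c->exit_signal;
}

/* Returns 1 if all three scenarios behave as described, 0 otherwise. */
static int demo_exit_signal(void)
{
    /* Nobody has exec'd: the custom signal survives. */
    struct exit_check unchanged = { SIGUSR1, 7, 7, 7, false };
    /* The child has exec'd: the signal is forced back to SIGCHLD. */
    struct exit_check child_execed = { SIGUSR1, 7, 8, 7, false };
    /* Same as above, but with the capability: the signal is kept. */
    struct exit_check privileged = { SIGUSR1, 7, 8, 7, true };

    return effective_exit_signal(&unchanged) == SIGUSR1 &&
           effective_exit_signal(&child_execed) == SIGCHLD &&
           effective_exit_signal(&privileged) == SIGUSR1;
}
```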

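Finally, exit_notify sets the thread's exit_state to EXIT_ZOMBIE, telling Linux that the thread has finished² and can be removed. Back in do_exit, the thread's state is then set to TASK_DEAD before schedule is called. TASK_DEAD ensures the thread is never picked to run again before Linux removes it, and it lets schedule, through context_switch and finish_task_switch, call put_task_struct to drop a reference to the thread's task_struct, which is freed once no references remain.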


tsk->state = TASK_DEAD;

schedule();


static void finish_task_switch(struct rq *rq, struct task_struct *prev)
        __releases(rq->lock)
{
        struct mm_struct *mm = rq->prev_mm;
        long prev_state;

        rq->prev_mm = NULL;

        /*
         * ...
         */
        prev_state = prev->state;
        finish_arch_switch(prev);
        finish_lock_switch(rq, prev);
        fire_sched_in_preempt_notifiers(current);
        if (mm)
                mmdrop(mm);
        if (unlikely(prev_state == TASK_DEAD)) {
                /*
                 * ...
                 */
                kprobe_flush_task(prev);
                put_task_struct(prev);
        }
}

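The state was explicitly set to TASK_DEAD before schedule was called, so why does the last test use unlikely? Because finish_task_switch runs after every context switch, and the thread being switched out has usually not finished. Compared with the context switches a thread goes through while running normally, the one caused by its exit is a tiny fraction³, so unlikely tells the compiler that the condition is almost always false and lets it arrange the code so the common path branches as cheaply as possible.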
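
For readers who have not met unlikely before, it is a thin wrapper around the compiler builtin __builtin_expect, which is available in GCC and Clang. The sketch below defines it the usual way and applies it to a simulated run of context switches. FAKE_TASK_RUNNING, FAKE_TASK_DEAD, and count_cleanups are made-up names, and the state values are arbitrary rather than the kernel's.

```c
#include <assert.h>

/* Hint that x is expected to be false; the result is still just x. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Arbitrary illustrative values, not the kernel's constants. */
enum { FAKE_TASK_RUNNING = 0, FAKE_TASK_DEAD = 1 };

/* Count how many "context switches" would need the cleanup path. */
static int count_cleanups(const int *prev_states, int n)
{
    int cleanups = 0;

    for (int i = 0; i < n; i++) {
        /* The hint affects code layout only, never the outcome. */
        if (unlikely(prev_states[i] == FAKE_TASK_DEAD))
            cleanups++;
    }
    return cleanups;
}

/* One thread is switched out 1000 times and is dead only the last time. */
static int demo_unlikely(void)
{
    int states[1000] = { FAKE_TASK_RUNNING };

    states[999] = FAKE_TASK_DEAD;
    return count_cleanups(states, 1000);
}
```

The hint never changes what the program computes. It only tells the compiler which side of the branch to treat as the hot path, which matches the ratio described above: many ordinary switches for every exit.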

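1 The TSS (Task State Segment) holds state the CPU needs when switching context. Linux keeps one per CPU, as the per_cpu(init_tss, cpu) lookup above shows. It will be covered later under memory management.
2 Including releasing all the resources it held.
3 Each thread exits only once but is context switched many times. Time slices range from tens to hundreds of milliseconds, as will be covered later in the discussion of scheduling.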

Tuesday, June 24, 2008
