Inserting into a binary search tree in C: answers

【Question title】: Inserting into a binary search tree in C
【Posted】: 2021-05-01 09:08:45
【Question】:

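I am currently learning C along with some data structures such as binary search trees. I have trouble understanding exactly how changing pointer values inside a function works in some cases but not in others... I'm attaching some code I wrote. It is an insert function that inserts values at the correct place in the BST (it works as it should). I tried working with pointers to pointers so that I could change values from within a function. Even though it works, I'm still really confused about why it does. I don't quite understand why my insert function actually changes the BST, even though I only work with local variables (tmp, parent_ptr) in it and don't really dereference any pointers apart from "tmp = *p2r".

Thanks for your help.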

#include <stdio.h>
#include <stdlib.h>


struct TreeNode{
    int val;
    struct TreeNode *left;
    struct TreeNode *right;
};

struct TreeNode** createTree(){
    struct TreeNode** p2r;
    p2r = malloc(sizeof(struct TreeNode*));
    *p2r = NULL;
    return p2r;
}

void insert(struct TreeNode** p2r, int val){
    // create TreeNode which we will insert
    struct TreeNode* new_node = malloc(sizeof(struct TreeNode));
    new_node -> val = val;
    new_node -> left = NULL;
    new_node -> right = NULL;
    //define onestep delayed pointer
    struct TreeNode* parent_ptr = NULL;
    struct TreeNode* tmp = NULL;
    tmp = *p2r;
    // find right place to insert node
    while (tmp != NULL){
        parent_ptr = tmp;
        if (tmp -> val < val) tmp = tmp->right;
        else tmp = tmp->left;
    }
    if (parent_ptr == NULL){
        *p2r = new_node;
    }
    else if (parent_ptr->val < val){ //then insert on the right
        parent_ptr -> right = new_node;
    }else{
        parent_ptr -> left = new_node;
    }
}

int main(){
    struct TreeNode **p2r = createTree();
    insert(p2r, 4);
    insert(p2r, 2);
    insert(p2r, 3);
    return 0;
}

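【Comments】:

If it was really you who wrote this function, then you should be the one explaining to us how it works. :)
It was mostly trial and error; I don't find the code itself confusing :)

Tags: c pointers binary-search-tree pass-by-reference heap-memory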

【Answer 1】:

Let's analyze the approach step by step.

First, consider the following simple program.

#include <stdio.h>
#include <stdlib.h>

struct TreeNode{
    int val;
    struct TreeNode *left;
    struct TreeNode *right;
};

void create( struct TreeNode *head, int val )
{
    head = malloc( sizeof( struct TreeNode ) );
    
    head->val   = val;
    head->left  = NULL;
    head->right = NULL;
}

int main(void) 
{
    struct TreeNode *head = NULL;
    
    printf( "Before calling the function create head == NULL is %s\n",
            head == NULL ? "true" : "false" );
            
    create( head, 10 );
    
    printf( "After  calling the function create head == NULL is %s\n",
            head == NULL ? "true" : "false" );
            
    return 0;
}

The program output is

Before calling the function create head == NULL is true
After  calling the function create head == NULL is true

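As you can see, the pointer head in main was not changed. The reason is that the function works with a copy of the value of the original pointer head, so changing the copy does not affect the original pointer.

If you rename the function parameter to head_parm (to distinguish the original pointer head from the function parameter), you can imagine the function definition and its call as follows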

create( head, 10 );

//...

void create( /*struct TreeNode *head_parm, int val */ )
{
    struct TreeNode *head_parm = head;
    int val = 10;
    head_parm = malloc( sizeof( struct TreeNode ) );
    //...

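That is, a local variable head_parm is created inside the function and initialized with the value of the argument head, and it is this function-local variable head_parm that is changed inside the function.

This means that function arguments are passed by value.

To change the original pointer head declared in main, you need to pass it by reference.

In C, passing by reference means passing an object indirectly through a pointer to it. By dereferencing that pointer inside the function, you get direct access to the original object.

So let's rewrite the program above as follows.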

#include <stdio.h>
#include <stdlib.h>

struct TreeNode{
    int val;
    struct TreeNode *left;
    struct TreeNode *right;
};

void create( struct TreeNode **head, int val )
{
    *head = malloc( sizeof( struct TreeNode ) );
    
    ( *head )->val   = val;
    ( *head )->left  = NULL;
    ( *head )->right = NULL;
}

int main(void) 
{
    struct TreeNode *head = NULL;
    
    printf( "Before calling the function create head == NULL is %s\n",
            head == NULL ? "true" : "false" );
            
    create( &head, 10 );
    
    printf( "After  calling the function create head == NULL is %s\n",
            head == NULL ? "true" : "false" );
            
    return 0;
}

Now the program output is

Before calling the function create head == NULL is true
After  calling the function create head == NULL is false

In the program from your question, you did not declare the pointer to the head node the way the program above does

struct TreeNode *head = NULL;

Instead, you allocated this pointer dynamically. In effect, what your program does is the following

#include <stdio.h>
#include <stdlib.h>

struct TreeNode{
    int val;
    struct TreeNode *left;
    struct TreeNode *right;
};

void create( struct TreeNode **head, int val )
{
    *head = malloc( sizeof( struct TreeNode ) );
    
    ( *head )->val   = val;
    ( *head )->left  = NULL;
    ( *head )->right = NULL;
}

int main(void) 
{
    struct TreeNode **p2r = malloc( sizeof( struct TreeNode * ) );
    *p2r = NULL;
    
    printf( "Before calling the function create *p2r == NULL is %s\n",
            *p2r == NULL ? "true" : "false" );
            
    create( p2r, 10 );
    
    printf( "After  calling the function create *p2r == NULL is %s\n",
            *p2r == NULL ? "true" : "false" );
            
    return 0;
}

The program output is

Before calling the function create *p2r == NULL is true
After  calling the function create *p2r == NULL is false

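Compared with the previous program, where you called the function create with the expression &head of type struct TreeNode **, you have now introduced an intermediate variable p2r that stores the value of the expression &head, by means of this code snippet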

struct TreeNode **p2r = malloc( sizeof( struct TreeNode * ) );
*p2r = NULL;

That is, earlier you called the function create like

create( &head, 10 );

Now you are in effect calling the function like

struct TreeNode **p2r = &head; // where head was allocated dynamically
create( p2r, 10 );

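The same happens in your program. That is, inside the function insert, dereferencing the pointer p2r gives you direct access to the pointer to the head node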

if (parent_ptr == NULL){
    *p2r = new_node;
    ^^^^ 
}

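So the function changes the pointer to the head node, which is passed by reference through the pointer p2r.

The data members left and right of the other nodes are likewise changed through references to them, using the pointer parent_ptr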

else if (parent_ptr->val < val){ //then insert on the right
    parent_ptr -> right = new_node;
    ^^^^^^^^^^^^^^^^^^^  
}else{
    parent_ptr -> left = new_node;
    ^^^^^^^^^^^^^^^^^^
}
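
One way to see that both cases are the same mechanism is a variant of insert that keeps the address of the pointer it is about to overwrite: first the root pointer, then some node's left or right member. This is only a sketch, not the code from the question; the names insert_via_link, link and free_tree are made up for illustration, and equal values go to the left as in the question's code.

```c
#include <stdlib.h>

struct TreeNode {
    int val;
    struct TreeNode *left;
    struct TreeNode *right;
};

/* Hypothetical variant of insert: returns 1 on success, 0 if malloc fails. */
int insert_via_link(struct TreeNode **p2r, int val)
{
    struct TreeNode *new_node = malloc(sizeof *new_node);
    if (new_node == NULL) return 0;
    new_node->val = val;
    new_node->left = NULL;
    new_node->right = NULL;

    /* link holds the address of the pointer that will receive the new node:
       initially the root pointer, later a left or right member of a node. */
    struct TreeNode **link = p2r;
    while (*link != NULL) {
        if ((*link)->val < val) link = &(*link)->right;
        else                    link = &(*link)->left;
    }
    *link = new_node;   /* the only write into the existing tree */
    return 1;
}

/* Frees every node of the tree in post-order. */
void free_tree(struct TreeNode *node)
{
    if (node == NULL) return;
    free_tree(node->left);
    free_tree(node->right);
    free(node);
}
```

With a root declared as struct TreeNode *root = NULL; in main, the first call insert_via_link( &root, 4 ) writes to root itself, while later calls write only to a left or right member of a node that is already in the tree.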

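【Comments】:

Thank you very much for the thorough answer! So do I understand correctly that I need the address of the root pointer in order to change it (the root) [which is why I need a pointer to a pointer], but if I want to change any other part of the tree, the root pointer is enough? So if I want to change any value in the tree (other than the root itself), I don't need to pass a pointer to the root pointer [that is, I could simply pass the root pointer to the function]? All this assuming, of course, that my root pointer is not a global variable and I want to change things inside a function...
@cactus_splines If you want to change the root pointer itself, you need to pass it by reference, that is, through a pointer to the pointer. If you need to change the data members of a node, you need to pass a pointer to the node.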

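【Answer 2】:

Although the pointers themselves are indeed local variables, they point to specific locations in memory. When you dereference a pointer with the -> operator, you access the memory that stores the very object the pointer points to. That is why your changes are also visible outside the function.

You basically told a local variable where your tree is stored; it helped with the insertion and then went out of scope. The tree itself is not a local variable, so the changes are reflected in it.

I suggest reading up on how pointers work.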
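
A small sketch of this point, using two made-up helpers (set_val and try_reset are not from the question): a function receives a copy of the pointer, so assigning to the parameter is invisible to the caller, but writing through it changes the node that both copies point to.

```c
#include <stddef.h>

struct TreeNode {
    int val;
    struct TreeNode *left;
    struct TreeNode *right;
};

/* Writes through the local copy of the pointer: the caller's node changes. */
void set_val(struct TreeNode *node, int val)
{
    node->val = val;
}

/* Reassigns only the local copy: the caller's pointer is unaffected. */
void try_reset(struct TreeNode *node)
{
    node = NULL;
    (void)node;   /* silences "set but not used" warnings */
}
```

If p points to a node, set_val( p, 7 ) changes that node's val to 7, whereas after try_reset( p ) the caller's p still points to the same node.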

【Comments】:

【Answer 3】:

First, one thing to always remember about pointers: they store memory addresses, not values. For example:

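#include <stdio.h>

int main(void)
{
    int val = 5;
    int copy = val;         // independent copy of the value of val
    int *original = &val;   // stores the address of val

    printf("%d\n", val);
    printf("%d\n", copy);
    printf("%d\n", *original);

    val = 8;                // changes the object that original points to

    printf("%d\n", val);
    printf("%d\n", copy);
    printf("%d\n", *original);

    return 0;
}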

When this code is executed, the output will be

5
5
5
8
5
8

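Note how, after val is changed, the value of copy stays the same while the value that original points to changes. This is because the pointer original points to the memory location of val.

Now, coming to the insert function: although you only use local variables (tmp, parent_ptr), remember that they are pointer variables and refer to memory addresses. So whenever you move to tmp -> right or tmp -> left in the loop, you are actually jumping from one location in memory to another, in the right order, which is why it works. The example below should make this clearer.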

     56 (A)
     /    \
    /      \
  45 (B)  60 (C)

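Consider the BST above, where the letters in parentheses stand for memory addresses. Let's insert 40 into this BST. Initially tmp points to A, the address of the node holding 56. Since 40 is less than 56, tmp moves left and now points to B, the address of the node holding 45. It moves left once more and becomes NULL. At this point parent_ptr still points to B, so the new node for 40 is attached as the left child of B.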

      56 (A)
     /    \
    /      \
  45 (B)  60 (C)
  /
 /
40 (D)
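
The walk described above can be reproduced with a traced copy of the question's insert. This is a sketch under assumptions: insert_traced and its path and max parameters are hypothetical additions that only record the values of the nodes tmp visits; the search and attach logic is the same as in the question.

```c
#include <stdlib.h>

struct TreeNode {
    int val;
    struct TreeNode *left;
    struct TreeNode *right;
};

/* Same search loop as the question's insert, but stores the value of every
   node that tmp visits in path (at most max entries are written).
   Returns the number of nodes visited, or -1 if malloc fails. */
int insert_traced(struct TreeNode **p2r, int val, int *path, int max)
{
    struct TreeNode *new_node = malloc(sizeof *new_node);
    if (new_node == NULL) return -1;
    new_node->val = val;
    new_node->left = NULL;
    new_node->right = NULL;

    struct TreeNode *parent_ptr = NULL;
    struct TreeNode *tmp = *p2r;
    int visited = 0;
    while (tmp != NULL) {
        if (visited < max) path[visited] = tmp->val;
        visited++;
        parent_ptr = tmp;                 /* one step behind tmp */
        if (tmp->val < val) tmp = tmp->right;
        else tmp = tmp->left;
    }
    if (parent_ptr == NULL) *p2r = new_node;            /* empty tree */
    else if (parent_ptr->val < val) parent_ptr->right = new_node;
    else parent_ptr->left = new_node;
    return visited;
}
```

After inserting 56, 45 and 60, inserting 40 visits two nodes and leaves 56 and 45 in path, matching the walk from A to B; the new node ends up as the left child of the node holding 45.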

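【Comments】:

Thank you very much! When you say memory address, do you mean the address of the BST in the heap (since my createTree function allocates memory on the heap with malloc)? As I understand it, the local variable tmp is stored on the stack but points into the heap, and that is why the BST is still changed after the function finishes? One more small question, if you or anyone can answer it: the "p2r" variable created in createTree is returned. When I call createTree in main and assign the result to a variable, is that stored on the stack?
By memory address I mean the address of a BST node. As for the variable "p2r": when createTree returns, the address it returns is stored in the "p2r" variable in main, which resides on the function call stack.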
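
To make the storage picture concrete: in the question's program the variable p2r in main lives on the stack, while the slot it points to and every node live on the heap, and none of that heap memory is ever freed. Below is a minimal sketch of allocation and cleanup, assuming hypothetical helper names (create_tree_slot, free_nodes, destroy_tree_slot) that do not appear in the question.

```c
#include <stdlib.h>

struct TreeNode {
    int val;
    struct TreeNode *left;
    struct TreeNode *right;
};

/* Heap-allocates the slot that holds the root pointer, like createTree in
   the question, but reports failure by returning NULL. */
struct TreeNode **create_tree_slot(void)
{
    struct TreeNode **p2r = malloc(sizeof *p2r);
    if (p2r != NULL) *p2r = NULL;
    return p2r;
}

/* Frees node and everything below it; returns how many nodes were freed. */
size_t free_nodes(struct TreeNode *node)
{
    if (node == NULL) return 0;
    size_t n = free_nodes(node->left) + free_nodes(node->right);
    free(node);
    return n + 1;
}

/* Frees all nodes and then the heap-allocated slot itself. */
size_t destroy_tree_slot(struct TreeNode **p2r)
{
    if (p2r == NULL) return 0;
    size_t n = free_nodes(*p2r);
    free(p2r);
    return n;
}
```

If the root pointer is instead an ordinary local variable (struct TreeNode *root = NULL; passed as &root), only free_nodes( root ) is needed, because the slot itself is released automatically when main returns.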