二分查找的一种改进-拉格朗日插值查找法

标签: 编程语言

引言

二分查找算法是比较早期时候接触到的算法,这种算法有两个要求,一个是要求是顺序存储结构,其实就是数组,另一个是要查找的表要按照大小有序排列。

代码讨论

轻量级查找

我们先看下最基础的查找,基本思想就是从一个数组上面挨个找,找到了就返回:

 public static int find_v1(int[] arr, int targetValue) {
        int targetIndex = -1;
        int times = 0;
        for (int i = 0; i < arr.length; i++) {
            times++;
            if (arr[i] == targetValue) {
                targetIndex = i;
                break;
            }
        }

        if (targetIndex != -1) {
            System.out.println("find_v1找到目标值下标:arr[" + targetIndex + "]=" + targetValue + ",查找次数为:" + times);
        } else {
            System.out.println("find_v1没有找到值");
        }
        return targetIndex;
    }

二分法的引入

二分法的思想是把数据一分为二,因为数组上面的数据其实是有序的,所以可以提前判断要找的目标数据是在前半段还是后半段:

public static int find_v2(int[] arr, int targetValue) {
        int targetIndex = -1;
        int times = 0;
        int low = 0;
        int high = arr.length - 1;
        while (low <= high) {
            times++;
            int mid = low+(high - low) / 2;
            if (arr[mid] == targetValue) {
                targetIndex = mid;
                break;
            } else if(targetValue > arr[mid]){
                low = mid+1;
            }else{
                high = mid-1;
            }
        }
        if (targetIndex != -1) {
            System.out.println("find_v2找到目标值下标:arr[" + targetIndex + "]=" + targetValue + ",查找次数为:" + times);
        } else {
            System.out.println("find_v2没有找到值");
        }
        return targetIndex;
    }

二分查找中的核心就是 int mid = low+(high - low) / 2,low我们可以理解为下界,high是我们查找范围的上界,自然来说high-low其实是我们查找的范围,画图表示其实就是:
在这里插入图片描述
我们其实可以把mid表示为mid=low+(high - low) *1/2,在这里1/2只是一个比率,我们其实可以看到目标数据在靠后一点点,不在中间范围内。我们其实可以进一步考虑,这个切割的时候靠中间往后切一点是不是比较合适,更加极端的,我们假如就是237,500的地方,是不是比率直接变成0.8会比较合适。
在这里插入图片描述
这个便是我们改进算法的思想,我们求出这个分布的比率,可以进一步缩小查找的范围,关键代码如下,我们是求目标值与全局查找的范围求得我们数据的比率:

double rate=(targetValue-arr[low])*1.0/(arr[high]-arr[low]);
int mid = (int)(low+(high - low)* rate);

完整代码如下:

  public static int find_v3(int[] arr, int targetValue) {
        int targetIndex = -1;
        int times = 0;
        int low = 0;
        int high = arr.length - 1;
        while (low < high) {
            times++;
            double rate=(targetValue-arr[low])*1.0/(arr[high]-arr[low]);
            System.out.printf("rate:%.20f\n",rate);
            int mid = (int)(low+(high - low)* rate);
            System.out.println("high="+high+" low="+low+" mid="+mid);
            if (arr[mid] == targetValue) {
                targetIndex = mid;
                break;
            } else if(targetValue > arr[mid]){
                low = mid+1;
            }else{
                high = mid-1;
            }
        }

        if (targetIndex != -1) {
            System.out.println("find_v3找到目标值下标:arr[" + targetIndex + "]=" + targetValue + ",查找次数为:" + times);
        } else {
            System.out.println("find_v3没有找到值");
        }
        return targetIndex;
    }

三种情况测试对比:

public static void main(String[] args) {
        int[] array = {1, 3, 4, 6, 7, 89, 234, 235, 236, 237, 500, 501, 502, 503, 504};
        System.out.println(Arrays.toString(array));
        System.out.printf("rate:%.20f\n",501*1.0/502);
        searchAll(array,501);
        searchAll(array,502);
        searchAll(array,89);
        searchAll(array,7);

    }

    public static void searchAll(int[] array,int targetValue){
        System.out.println("searchAll begin...");
        find_v1(array, targetValue);
        find_v2(array, targetValue);
        find_v3(array, targetValue);
        System.out.println("searchAll end...");

    }
[1, 3, 4, 6, 7, 89, 234, 235, 236, 237, 500, 501, 502, 503, 504]
rate:0.99800796812749000000
searchAll begin...
find_v1找到目标值下标:arr[11]=501,查找次数为:12
find_v2找到目标值下标:arr[11]=501,查找次数为:2
rate:0.99403578528827040000
high=14 low=0 mid=13
rate:0.99800399201596800000
high=12 low=0 mid=11
find_v3找到目标值下标:arr[11]=501,查找次数为:2
searchAll end...
searchAll begin...
find_v1找到目标值下标:arr[12]=502,查找次数为:13
find_v2找到目标值下标:arr[12]=502,查找次数为:4
rate:0.99602385685884690000
high=14 low=0 mid=13
rate:1.00000000000000000000
high=12 low=0 mid=12
find_v3找到目标值下标:arr[12]=502,查找次数为:2
searchAll end...
searchAll begin...
find_v1找到目标值下标:arr[5]=89,查找次数为:6
find_v2找到目标值下标:arr[5]=89,查找次数为:3
rate:0.17495029821073560000
high=14 low=0 mid=2
rate:0.16666666666666666000
high=14 low=3 mid=4
rate:0.00000000000000000000
high=14 low=5 mid=5
find_v3找到目标值下标:arr[5]=89,查找次数为:3
searchAll end...
searchAll begin...
find_v1找到目标值下标:arr[4]=7,查找次数为:5
find_v2找到目标值下标:arr[4]=7,查找次数为:4
rate:0.01192842942345924400
high=14 low=0 mid=0
rate:0.00798403193612774400
high=14 low=1 mid=1
rate:0.00600000000000000000
high=14 low=2 mid=2
rate:0.00200803212851405600
high=14 low=3 mid=3
rate:0.00000000000000000000
high=14 low=4 mid=4
find_v3找到目标值下标:arr[4]=7,查找次数为:5
searchAll end...

版权声明:本文为zhuxuemin1991原创文章,遵循 CC 4.0 BY-SA 版权协议,转载请附上原文出处链接和本声明。
本文链接:https://blog.csdn.net/zhuxuemin1991/article/details/104108978