Google Drive API: list files with no parent

【Question title】: Google Drive API: list files with no parent
【Posted】: 2012-12-24 20:43:52
【Question】:

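Files in the Google domain that I administer have gotten into a bad state; there are thousands of files residing in the root directory. I want to identify these files and move them to a folder underneath "My Drive".

When I use the API to list the parents of one of these orphaned files, the result is an empty array. To determine whether a file is orphaned, I can iterate over all the files in my domain and request the list of parents for each one. If the list is empty, I know the file is orphaned.

But this is far too slow.

Is there a way to use the Drive API to search for files that have no parents?

The "parents" field of the q parameter doesn't seem useful for this, because it can only specify that the parents list contains a given ID.

Update:

I'm trying to find a quick way to locate items that are truly at the root of the document hierarchy. That is, they are siblings of "My Drive", not children of "My Drive".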

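【Comments】:

This sounds like a bug; we shouldn't allow a file to have no parents at all.
The Drive UI explicitly lets you move files into this state, although it advises against it. Being able to query for such files would be great.
Did you ever find a solution?
@Peter Alfvin Unfortunately, at the current stage, files without a parent folder cannot be retrieved directly with the Drive API. How about these two workarounds? 1. Retrieve all files, then pick out the ones that have no parent folder. 2. Retrieve all folders, then retrieve the files that are not contained in any of them. Both can be done with the files.list method. I apologize if this is not the direction you want. By the way, what language do you want to use?
@Tanaike Unfortunately, the list method doesn't let you retrieve the parents information. You have to use get to obtain the parents, so the first workaround won't work. The second won't either, because a file can have parents that you don't have access to, so you can't derive the parentless files by enumerating all the folders you can access. As for language, I happen to be using JavaScript in the context where I need this, but I'm not using the Drive library. I'm just making REST calls.

Tags: google-api google-drive-api google-api-client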

【Solution 1】:

In Java:

// Assumes `drive` is an authorized com.google.api.services.drive.Drive (v2) client.
Files.List request = drive.files().list();
request.setQ("'root' in parents");

FileList files = request.execute();

// Print the title of every item directly under the root folder.
for (com.google.api.services.drive.model.File element : files.getItems()) {
    System.out.println(element.getTitle());
}

'root' is the parent folder if the file or folder is located in the root directory.
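
For comparison, here is a minimal Python sketch of the same query. It assumes a google-api-python-client `service` object for Drive v3, where the v2 `items` and `title` fields are called `files` and `name`. The helper names `parent_query` and `list_children` are invented for illustration and are not part of the library. As the discussion below notes, this lists the children of "My Drive", not orphaned files.

```python
def parent_query(folder_id="root"):
    """Build a Drive search clause matching items whose parents include folder_id."""
    # Backslashes and single quotes inside the value must be escaped in q.
    escaped = folder_id.replace("\\", "\\\\").replace("'", "\\'")
    return "'{0}' in parents".format(escaped)


def list_children(service, folder_id="root"):
    """Return the names of all items directly under folder_id (hypothetical helper)."""
    names = []
    page_token = None
    while True:
        response = service.files().list(
            q=parent_query(folder_id),
            fields="nextPageToken, files(id, name)",
            pageToken=page_token,
        ).execute()
        names.extend(f["name"] for f in response.get("files", []))
        page_token = response.get("nextPageToken")
        if page_token is None:
            break
    return names
```

Passing `service` in as an argument keeps the helper independent of how the credentials were obtained and makes it easy to try against a stub.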

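【Discussion】:

This finds files and folders in "My Drive", which isn't actually the root, although confusingly the "My Drive" folder has the attribute isRoot = true. I'm trying to find a way to quickly locate items at the actual root of the document hierarchy, i.e. siblings of "My Drive". I've updated my question to reflect this.
@Jasper Can you give me a link to the library you're using? I can't find any execute() function.
You can get the request from com.google.api.services.drive.Drive, which can be created using com.google.api.services.drive.Drive.Builder.Builder(HttpTransport, JsonFactory, HttpRequestInitializer).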

【Solution 2】:

Brute force, but simple, and it works.

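    // Assumes `request` is a Files.List from drive.files().list() (Drive v2 client)
    // and `orphans` is a List<File> declared by the caller.
    do {
        try {
            FileList files = request.execute();

            for (File f : files.getItems()) {
                // getParents() can be null when the field is absent.
                if (f.getParents() == null || f.getParents().isEmpty()) {
                    System.out.println("Orphan found:\t" + f.getTitle());
                    orphans.add(f);
                }
            }

            request.setPageToken(files.getNextPageToken());
        } catch (IOException e) {
            System.out.println("An error occurred: " + e);
            request.setPageToken(null);
        }
    } while (request.getPageToken() != null
            && request.getPageToken().length() > 0);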
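
The same paging-and-filtering idea can be sketched in Python. This is only an illustration: `find_orphans` and its `fetch_page` parameter are hypothetical names, and the page dictionaries are assumed to follow the Drive v3 response shape (`files`, `nextPageToken`, and a per-file `parents` list that is absent or empty for orphans).

```python
def find_orphans(fetch_page):
    """Collect files whose 'parents' list is missing or empty.

    fetch_page is a hypothetical callable: given a page token (None for the
    first page) it returns a dict with 'files' and, optionally, 'nextPageToken'.
    """
    orphans = []
    page_token = None
    while True:
        page = fetch_page(page_token)
        for f in page.get("files", []):
            # A missing key and an empty list are both treated as "no parent".
            if not f.get("parents"):
                orphans.append(f)
        page_token = page.get("nextPageToken")
        if not page_token:
            break
    return orphans
```

With google-api-python-client, `fetch_page` could be something like `lambda token: service.files().list(pageToken=token, fields="nextPageToken, files(id, name, parents)").execute()`, assuming `service` is an authorized Drive v3 client. Like the Java loop, this still walks every file, so it does not address the speed concern in the question.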

【Discussion】:

【Solution 3】:

Try using this in your query:

'root' in parents

【Discussion】:

gsuite-developers.googleblog.com/2012/08/…

【Solution 4】:

The documentation recommends the following query: is:unorganized owner:me.

【Discussion】:

That is not the documentation for the API. These queries don't work in API calls.

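【Solution 5】:

The premise is:

List all files.
If a file has no 'parents' field, it is an orphan.
The script then deletes the orphans.

Before you start, you need to:

Create an OAuth ID.
Add the '../auth/drive' permission to your OAuth ID and validate your app against Google, so that you have delete permissions.

Ready-to-copy-paste demo: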

from __future__ import print_function
import pickle
import os.path
from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request

# If modifying these scopes, delete the file token.pickle.
SCOPES = ['https://www.googleapis.com/auth/drive']

def callback(request_id, response, exception):
    if exception:
        print("Exception:", exception)

def main():
    """
   Description:
   Shows basic usage of the Drive v3 API to delete orphan files.
   """

    """ --- CHECK CREDENTIALS --- """
    creds = None
    # The file token.pickle stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    if os.path.exists('token.pickle'):
        with open('token.pickle', 'rb') as token:
            creds = pickle.load(token)
    # If there are no (valid) credentials available, let the user log in.
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                'credentials.json', SCOPES)
            creds = flow.run_local_server(port=0)
        # Save the credentials for the next run
        with open('token.pickle', 'wb') as token:
            pickle.dump(creds, token)

    """ --- OPEN CONNECTION --- """
    service = build('drive', 'v3', credentials=creds)

    page_token = ""
    files = None
    orphans = []
    page_size = 100
    batch_counter = 0

    print("LISTING ORPHAN FILES")
    print("-----------------------------")
    while (True):
        # List
        r = service.files().list(pageToken=page_token,
                                 pageSize=page_size,
                                 fields="nextPageToken, files"
                                 ).execute()
        page_token = r.get('nextPageToken')
        files = r.get('files', [])

        # Filter orphans
        # NOTE: a file with no 'parents' field (or an empty one) is an orphan
        for file in files:
            if file.get('parents'):
                print("File with a parent found.")
            else:
                print("Orphan file found.")
                orphans.append(file['id'])

        # Exit condition
        if page_token is None:
            break

    print("DELETING ORPHAN FILES")
    print("-----------------------------")
    while orphans:
        # Take at most 100 IDs per batch so that the last, smaller batch
        # does not index past the end of the list.
        current_batch = orphans[:100]
        del orphans[:100]
        batch = service.new_batch_http_request(callback=callback)
        for file_id in current_batch:
            print("File with id {0} queued for deletion.".format(file_id))
            batch.add(service.files().delete(fileId=file_id))
        batch.execute()
        batch_counter += 1
        print("BATCH {0} DELETED - {1} FILES DELETED".format(batch_counter,
                                                             len(current_batch)))


if __name__ == '__main__':
    main()

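This method won't delete files in the root directory, because they have the 'root' value in their 'parents' field. If not all of your orphaned files are listed, it means they are already being deleted automatically by Google. That process can take up to 24 hours.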
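
The original question asks to move the orphans into a folder rather than delete them. Below is a minimal sketch of that variant. It assumes the same Drive v3 `service` object as the script above, and it assumes the authorized account is allowed to add a parent to each file. `move_into_folder` and its parameters are names invented here, not part of the library.

```python
def move_into_folder(service, file_ids, folder_id):
    """Add folder_id as a parent of each file and return the IDs that succeeded."""
    moved = []
    for file_id in file_ids:
        try:
            # files().update with addParents attaches the file to the folder.
            service.files().update(
                fileId=file_id,
                addParents=folder_id,
                fields="id, parents",
            ).execute()
            moved.append(file_id)
        except Exception as exc:  # broad on purpose: this is only a sketch
            print("Could not move {0}: {1}".format(file_id, exc))
    return moved
```

The calls are made one at a time to keep the sketch short; they could be queued on a batch request in the same way as the deletions in the script above.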

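【Discussion】:

【Solution 6】:

Adreian Lopez, thanks for your script. It really saved me a lot of manual work. Here are the steps I followed to get your script running:

1. Created a folder c:\temp\pythonscript\
2. Used https://console.cloud.google.com/apis/credentials to create an OAuth 2.0 client ID and downloaded the credentials file to c:\temp\pythonscript\
3. Renamed the downloaded client_secret_#######-#############.apps.googleusercontent.com.json to credentials.json
4. Copied Adreian Lopez's Python script and saved it as c:\temp\pythonscript\deleteGoogleDriveOrphanFiles.py
5. On Windows 10, went to the "Microsoft Store" and installed Python 3.8
6. Opened a command prompt and typed: cd c:\temp\pythonscript\
7. Ran pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib
8. Ran python deleteGoogleDriveOrphanFiles.py and followed the on-screen steps to create the c:\temp\pythonscript\token.pickle file and start deleting orphaned files. This step can take quite a while.
9. Verified the result at https://one.google.com/u/1/storage
10. Ran step 8 again as needed.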

【Discussion】:

Mentioning someone with (@) in an answer does not notify that person, so there is no need to add the @ sign.