For those interested, I found a solution that is not very elegant but seems to work:
1) Write a script call_proto.sh:
#!/bin/sh
export LD_LIBRARY_PATH=@TARGET_PROTOBUF_LIB_DIR@:$LD_LIBRARY_PATH
exec @TARGET_PROTOC_EXECUTABLE@ "$@"
Replace @TARGET_PROTOBUF_LIB_DIR@ and @TARGET_PROTOC_EXECUTABLE@ with the appropriate values: the directory containing the protobuf libraries and the full path to the protoc executable, respectively.
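As a rough illustration of step 1, the placeholders could be filled in with sed and the resulting wrapper tried against a stand-in for protoc. This is only a sketch: the library directory /opt/protobuf/lib, the fake_protoc stand-in, and the call_proto.sh.in template name are all hypothetical, and the temporary directory is assumed to allow executing scripts.

```shell
# Sketch only: every path and file name below is made up for the demonstration.
work=$(mktemp -d)

# Stand-in for the real protoc: prints the environment and arguments it receives.
cat > "$work/fake_protoc" <<'EOF'
#!/bin/sh
echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
for arg in "$@"; do echo "arg=$arg"; done
EOF
chmod +x "$work/fake_protoc"

# Template using the same placeholders as call_proto.sh.
cat > "$work/call_proto.sh.in" <<'EOF'
#!/bin/sh
export LD_LIBRARY_PATH=@TARGET_PROTOBUF_LIB_DIR@:$LD_LIBRARY_PATH
exec @TARGET_PROTOC_EXECUTABLE@ "$@"
EOF

lib_dir=/opt/protobuf/lib   # hypothetical protobuf library directory

# Substitute both placeholders to produce the final wrapper script.
sed -e "s|@TARGET_PROTOBUF_LIB_DIR@|$lib_dir|g" \
    -e "s|@TARGET_PROTOC_EXECUTABLE@|$work/fake_protoc|g" \
    "$work/call_proto.sh.in" > "$work/call_proto.sh"
chmod +x "$work/call_proto.sh"

# Call the wrapper the way a build would call protoc.
wrapper_output=$(LD_LIBRARY_PATH=/usr/lib "$work/call_proto.sh" --cpp_out=out "my file.proto")
echo "$wrapper_output"
```

The stand-in should report the library directory prepended to LD_LIBRARY_PATH and each argument forwarded unchanged, including the one containing a space.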
2) Continuing my earlier example, in tensorflow-1.13.1/third_party/systemlibs/protobuf.BUILD, replace:
genrule(
name = "protoc",
outs = ["protoc.bin"],
cmd = "ln -s @PATH_TO_PROTOBUF@/bin/protoc $@",
executable = 1,
visibility = ["//visibility:public"],
)
with:
genrule(
name = "protoc",
outs = ["protoc.bin"],
cmd = "ln -s @PROTO_CALL_SCRIPT@ $@",
executable = 1,
visibility = ["//visibility:public"],
)
where @PROTO_CALL_SCRIPT@ is the path to the script from step 1.
This does solve the problem of passing LD_LIBRARY_PATH to the protoc invocation.
Unfortunately, another problem now appears:
ERROR: /home/robin/.cache/bazel/_bazel_robin/c04a70144cd329180403af87e4cbdc44/external/protobuf_archive/BUILD.bazel:44:1: declared output 'external/protobuf_archive/google/protobuf/any.pb.h' is a dangling symbolic link
[21 / 86] 7 actions, 6 running
Executing genrule @protobuf_archive//:link_headers [for host]; 0s local
Executing genrule @local_config_cuda//cuda:cuda-include; 0s local
@com_google_absl//absl/base:dynamic_annotations; 0s local
Executing genrule @local_config_cuda//cuda:cuda-lib [for host]; 0s local
ProtoCompile tensorflow/core/example/example.pb.cc [for host]; 0s local
ProtoCompile tensorflow/core/example/example.pb.cc; 0s local
[-----] //tensorflow/cc:ops/candidate_sampling_ops_gen_cc
ERROR: /home/robin/.cache/bazel/_bazel_robin/c04a70144cd329180403af87e4cbdc44/external/protobuf_archive/BUILD.bazel:44:1: declared output 'external/protobuf_archive/google/protobuf/any.proto' is a dangling symbolic link
[21 / 89] 7 actions, 6 running
Executing genrule @protobuf_archive//:link_headers [for host]; 0s local
Executing genrule @local_config_cuda//cuda:cuda-include; 0s local
@com_google_absl//absl/base:dynamic_annotations; 0s local
Executing genrule @local_config_cuda//cuda:cuda-lib [for host]; 0s local
ProtoCompile tensorflow/core/example/example.pb.cc [for host]; 0s local
ProtoCompile tensorflow/core/example/example.pb.cc; 0s local
[-----] //tensorflow/cc:ops/candidate_sampling_ops_gen_cc
ERROR: /home/robin/.cache/bazel/_bazel_robin/c04a70144cd329180403af87e4cbdc44/external/protobuf_archive/BUILD.bazel:44:1: declared output 'external/protobuf_archive/google/protobuf/arena.h' is a dangling symbolic link
[21 / 89] 7 actions, 6 running
...
The same "is a dangling symbolic link" error is reported for every protobuf header.
Indeed, the corresponding protobuf_archive folder contains dangling symlinks, for example:
any.pb.h -> /usr/include/google/protobuf/any.pb.h
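One way to list such links is to ask find for symlinks whose target does not exist. The sketch below builds a small throwaway tree to show the idea; the directory layout and file names are invented for the demonstration, and in the real build the directory to scan would be the protobuf_archive output folder.

```shell
# Sketch only: a throwaway tree with one valid and one dangling symlink.
work=$(mktemp -d)
mkdir -p "$work/google/protobuf"
: > "$work/real_any.h"
ln -s "$work/real_any.h" "$work/google/protobuf/any.h"                                # valid link
ln -s /nonexistent/include/google/protobuf/any.pb.h "$work/google/protobuf/any.pb.h"  # dangling link

# "test -e" follows symlinks, so it fails exactly for links whose target is missing.
dangling=$(find "$work" -type l ! -exec test -e {} \; -print)
echo "$dangling"
```

Only the any.pb.h link should be printed, since its target does not exist.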
As far as I understand, the problem comes from the link_headers genrule:
genrule(
name = "link_headers",
outs = HEADERS,
cmd = """
echo "OUTS=$(OUTS) include=$(INCLUDEDIR)"
for i in $(OUTS); do
f=$${i#$(@D)/}
mkdir -p $(@D)/$${f%/*}
ln -sf $(INCLUDEDIR)/$$f $(@D)/$$f
done
""",
)
Because INCLUDEDIR is /usr/include, which is not where my protobuf headers are installed, every generated symlink is dangling.
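To see how the loop body in link_headers arrives at each link target, here is a standalone sketch of the same parameter expansions for a single output. In the genrule, $$ is Bazel's escape for a literal $, so $${i#$(@D)/} becomes an ordinary ${i#...} in the shell. The out_dir value is only illustrative; the real $(@D) depends on the build configuration.

```shell
# Sketch only: plain-shell equivalent of one iteration of the link_headers loop.
out_dir="bazel-out/host/genfiles/external/protobuf_archive"   # stands in for $(@D), illustrative value
include_dir="/usr/include"                                    # stands in for $(INCLUDEDIR)
out="$out_dir/google/protobuf/any.pb.h"                       # one entry of $(OUTS)

f=${out#"$out_dir"/}           # strip the output-dir prefix: google/protobuf/any.pb.h
sub_dir=${f%/*}                # strip the file name:         google/protobuf
link_target="$include_dir/$f"  # what the symlink will point to

echo "mkdir -p $out_dir/$sub_dir"
echo "ln -sf $link_target $out_dir/$f"
```

The link target is always INCLUDEDIR plus the header's relative path, so if the headers are not actually under that directory, every link dangles.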
The workaround is to modify this rule so that it points to a valid include folder:
genrule(
name = "link_headers",
outs = HEADERS,
cmd = """
for i in $(OUTS); do
f=$${i#$(@D)/}
mkdir -p $(@D)/$${f%/*}
ln -sf @TARGET_PROTOBUF_INCLUDE@/$$f $(@D)/$$f
done
""",
)
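where @TARGET_PROTOBUF_INCLUDE@ is the path to the protobuf include directory on the filesystem. The symlinks are now generated correctly and the dangling-symlink errors are gone.
I then got errors reporting that some protobuf header files were unknown. The fix is simply to give HEADERS in protobuf.BUILD a complete value. I set HEADERS to: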
HEADERS = [
"google/protobuf/any.h",
"google/protobuf/any.pb.h",
"google/protobuf/any.proto",
"google/protobuf/api.pb.h",
"google/protobuf/api.proto",
"google/protobuf/arena.h",
"google/protobuf/arena_impl.h",
"google/protobuf/arenastring.h",
"google/protobuf/compiler/code_generator.h",
"google/protobuf/compiler/command_line_interface.h",
"google/protobuf/compiler/cpp/cpp_generator.h",
"google/protobuf/compiler/csharp/csharp_generator.h",
"google/protobuf/compiler/csharp/csharp_names.h",
"google/protobuf/compiler/importer.h",
"google/protobuf/compiler/java/java_generator.h",
"google/protobuf/compiler/java/java_names.h",
"google/protobuf/compiler/js/js_generator.h",
"google/protobuf/compiler/js/well_known_types_embed.h",
"google/protobuf/compiler/objectivec/objectivec_generator.h",
"google/protobuf/compiler/objectivec/objectivec_helpers.h",
"google/protobuf/compiler/parser.h",
"google/protobuf/compiler/php/php_generator.h",
"google/protobuf/compiler/plugin.h",
"google/protobuf/compiler/plugin.pb.h",
"google/protobuf/compiler/plugin.proto",
"google/protobuf/compiler/python/python_generator.h",
"google/protobuf/compiler/ruby/ruby_generator.h",
"google/protobuf/descriptor.h",
"google/protobuf/descriptor.pb.h",
"google/protobuf/descriptor.proto",
"google/protobuf/descriptor_database.h",
"google/protobuf/duration.pb.h",
"google/protobuf/duration.proto",
"google/protobuf/dynamic_message.h",
"google/protobuf/empty.pb.h",
"google/protobuf/empty.proto",
"google/protobuf/extension_set.h",
"google/protobuf/extension_set_inl.h",
"google/protobuf/field_mask.pb.h",
"google/protobuf/field_mask.proto",
"google/protobuf/generated_enum_reflection.h",
"google/protobuf/generated_enum_util.h",
"google/protobuf/generated_message_reflection.h",
"google/protobuf/generated_message_table_driven.h",
"google/protobuf/generated_message_util.h",
"google/protobuf/has_bits.h",
"google/protobuf/implicit_weak_message.h",
"google/protobuf/inlined_string_field.h",
"google/protobuf/io/coded_stream.h",
"google/protobuf/io/gzip_stream.h",
"google/protobuf/io/printer.h",
"google/protobuf/io/strtod.h",
"google/protobuf/io/tokenizer.h",
"google/protobuf/io/zero_copy_stream.h",
"google/protobuf/io/zero_copy_stream_impl.h",
"google/protobuf/io/zero_copy_stream_impl_lite.h",
"google/protobuf/map.h",
"google/protobuf/map_entry.h",
"google/protobuf/map_entry_lite.h",
"google/protobuf/map_field.h",
"google/protobuf/map_field_inl.h",
"google/protobuf/map_field_lite.h",
"google/protobuf/map_type_handler.h",
"google/protobuf/message.h",
"google/protobuf/message_lite.h",
"google/protobuf/metadata.h",
"google/protobuf/metadata_lite.h",
"google/protobuf/parse_context.h",
"google/protobuf/port.h",
"google/protobuf/port_def.inc",
"google/protobuf/port_undef.inc",
"google/protobuf/reflection.h",
"google/protobuf/reflection_ops.h",
"google/protobuf/repeated_field.h",
"google/protobuf/service.h",
"google/protobuf/source_context.pb.h",
"google/protobuf/source_context.proto",
"google/protobuf/struct.pb.h",
"google/protobuf/struct.proto",
"google/protobuf/stubs/bytestream.h",
"google/protobuf/stubs/callback.h",
"google/protobuf/stubs/casts.h",
"google/protobuf/stubs/common.h",
"google/protobuf/stubs/fastmem.h",
"google/protobuf/stubs/hash.h",
"google/protobuf/stubs/logging.h",
"google/protobuf/stubs/macros.h",
"google/protobuf/stubs/mutex.h",
"google/protobuf/stubs/once.h",
"google/protobuf/stubs/platform_macros.h",
"google/protobuf/stubs/port.h",
"google/protobuf/stubs/status.h",
"google/protobuf/stubs/stl_util.h",
"google/protobuf/stubs/stringpiece.h",
"google/protobuf/stubs/strutil.h",
"google/protobuf/stubs/template_util.h",
"google/protobuf/text_format.h",
"google/protobuf/timestamp.pb.h",
"google/protobuf/timestamp.proto",
"google/protobuf/type.pb.h",
"google/protobuf/type.proto",
"google/protobuf/unknown_field_set.h",
"google/protobuf/util/delimited_message_util.h",
"google/protobuf/util/field_comparator.h",
"google/protobuf/util/field_mask_util.h",
"google/protobuf/util/json_util.h",
"google/protobuf/util/message_differencer.h",
"google/protobuf/util/time_util.h",
"google/protobuf/util/type_resolver.h",
"google/protobuf/util/type_resolver_util.h",
"google/protobuf/wire_format.h",
"google/protobuf/wire_format_lite.h",
"google/protobuf/wire_format_lite_inl.h",
"google/protobuf/wrappers.pb.h",
"google/protobuf/wrappers.proto",
]
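This is simply the list of files found in the protobuf include directory. After checking, I can confirm that symlinks for all the headers of my particular protobuf installation are now created in bazel-genfiles/external/protobuf_archive.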
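Rather than typing such a list by hand, it could be generated from the include directory. The following is a sketch under assumptions: it uses a small fake include tree in a temporary directory, and it keeps only .h, .inc and .proto files, which matches the kinds of entries in the list above but is my own choice of filter.

```shell
# Sketch only: a fake include tree stands in for the real protobuf include directory.
work=$(mktemp -d)
mkdir -p "$work/include/google/protobuf/stubs"
: > "$work/include/google/protobuf/any.h"
: > "$work/include/google/protobuf/any.proto"
: > "$work/include/google/protobuf/port_def.inc"
: > "$work/include/google/protobuf/stubs/common.h"
: > "$work/include/google/protobuf/README.txt"   # not a header, should be skipped

# List header-like files relative to the include root, sorted, one quoted entry per line.
headers_list=$(
  cd "$work/include" &&
  find google/protobuf -type f \( -name '*.h' -o -name '*.inc' -o -name '*.proto' \) |
    LC_ALL=C sort |
    sed -e 's|^|    "|' -e 's|$|",|'
)
printf 'HEADERS = [\n%s\n]\n' "$headers_list"
```

The printed text can then be pasted into protobuf.BUILD in place of the existing HEADERS assignment.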
At that point I was quite confident, but perhaps too much so, because I got a new error:
ERROR: .../tensorflow-1.13.1/tensorflow/stream_executor/BUILD:18:1: C++ compilation of rule '//tensorflow/stream_executor:dnn_proto_cc_impl' failed (Exit 1): crosstool_wrapper_driver_is_not_gcc failed: error executing command
(cd /home/robin/.cache/bazel/_bazel_robin/c04a70144cd329180403af87e4cbdc44/execroot/org_tensorflow && \
exec env - \
PATH=/bin:/usr/bin \
PWD=/proc/self/cwd \
external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -MD -MF bazel-out/host/bin/tensorflow/stream_executor/_objs/dnn_proto_cc_impl/dnn.pb.pic.d '-frandom-seed=bazel-out/host/bin/tensorflow/stream_executor/_objs/dnn_proto_cc_impl/dnn.pb.pic.o' -iquote . -iquote bazel-out/host/genfiles -iquote bazel-out/host/bin -iquote external/protobuf_archive -iquote bazel-out/host/genfiles/external/protobuf_archive -iquote bazel-out/host/bin/external/protobuf_archive '-std=c++11' -Wno-builtin-macro-redefined '-D__DATE__="redacted"' '-D__TIMESTAMP__="redacted"' '-D__TIME__="redacted"' -fPIC -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -Wall -fno-omit-frame-pointer -no-canonical-prefixes -fno-canonical-system-headers -DNDEBUG -g0 -O2 -ffunction-sections -fdata-sections -g0 '-march=native' -g0 -Wno-unknown-warning-option -Wno-unused-but-set-variable -Wno-sign-compare -c bazel-out/host/genfiles/tensorflow/stream_executor/dnn.pb.cc -o bazel-out/host/bin/tensorflow/stream_executor/_objs/dnn_proto_cc_impl/dnn.pb.pic.o)
Execution platform: @bazel_tools//platforms:host_platform
[255 / 752] 9 actions running
@local_config_cuda//cuda:cuda-include; 5s local
@com_google_absl//absl/base:base; 0s local
Compiling tensorflow/stream_executor/dnn.pb.cc [for host]; 0s local
Executing genrule @jpeg//:simd_x86_64_assemblage23 [for host]; 0s local
Compiling .../costs/op_performance_data.pb.cc [for host]; 0s local
Compiling .../core/protobuf/transport_options.pb.cc [for host]; 0s local
Compiling .../core/protobuf/rewriter_config.pb.cc [for host]; 0s local
Compiling tensorflow/core/framework/types.pb.cc [for host]; 0s local ...
In file included from bazel-out/host/genfiles/tensorflow/stream_executor/dnn.pb.cc:4:0:
bazel-out/host/genfiles/tensorflow/stream_executor/dnn.pb.h:10:10: fatal error: google/protobuf/port_def.inc: No such file or directory
#include <google/protobuf/port_def.inc>
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
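An include problem again, but in a different target. This time I really do not understand what is going on: the option -iquote bazel-out/host/genfiles/external/protobuf_archive is passed to the compiler, and that folder really does contain the previously generated symlink google/protobuf/port_def.inc.
I think the problem comes from the -iquote flag passed to the compiler. Shouldn't it be -I? How can I fix this?
---- EDIT ----
The solution is to add an includes attribute for the include folder, as follows: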
cc_library(
name = "protobuf",
hdrs = HEADERS,
includes = ["."],
linkopts = ["@TARGET_PROTOBUF_LIB@"],
visibility = ["//visibility:public"],
)
cc_library(
name = "protobuf_headers",
hdrs = HEADERS,
includes = ["."],
linkopts = ["@TARGET_PROTOBUF_LIB@"],
visibility = ["//visibility:public"],
)
Remark: according to the Bazel documentation, using includes = ["."] is apparently an anti-pattern, but it is the only way I found to make this work.
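The behaviour is consistent with how GCC-style compilers treat the two flags: directories given with -iquote are searched only for the #include "..." form, while the failing line uses #include <...>. As far as I understand the Bazel documentation, the includes attribute typically results in -isystem flags, which are searched for both forms. The sketch below tries both flags on a tiny file; the demo_pkg/demo_only.inc header name is hypothetical, and the check is skipped if no cc compiler is on the PATH.

```shell
# Sketch only: compare -iquote and -isystem for an angle-bracket include.
work=$(mktemp -d)
mkdir -p "$work/inc/demo_pkg"
: > "$work/inc/demo_pkg/demo_only.inc"   # hypothetical header, unlikely to exist system-wide
printf '#include <demo_pkg/demo_only.inc>\nint main(void) { return 0; }\n' > "$work/main.c"

iquote_result=skipped
isystem_result=skipped
if command -v cc >/dev/null 2>&1; then
  # -iquote directories are not consulted for <...> includes.
  if cc -iquote "$work/inc" -c "$work/main.c" -o "$work/iquote.o" 2>/dev/null; then
    iquote_result=found
  else
    iquote_result=missing
  fi
  # -isystem directories are consulted for <...> includes.
  if cc -isystem "$work/inc" -c "$work/main.c" -o "$work/isystem.o" 2>/dev/null; then
    isystem_result=found
  else
    isystem_result=missing
  fi
fi
echo "-iquote: $iquote_result, -isystem: $isystem_result"
```

With a GCC- or Clang-compatible cc, the header should be missing under -iquote and found under -isystem.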
Finally, I am back to the first error:
ERROR: /home/robin/soft/PID/pid-workspace/wrappers/tensorflow/build/1.13.1/tensorflow-1.13.1/tensorflow/core/BUILD:2489:1: Executing genrule //tensorflow/core:protos_all_proto_text_srcs failed (Exit 127): bash failed: error executing command
(cd /home/robin/.cache/bazel/_bazel_robin/c04a70144cd329180403af87e4cbdc44/execroot/org_tensorflow && \
exec env - \
PATH=/bin:/usr/bin \
/bin/bash -c 'source external/bazel_tools/tools/genrule/genrule-setup.sh; bazel-out/host/bin/tensorflow/tools/proto_text/gen_proto_text_functions bazel-out/host/genfiles/tensorflow/core tensorflow/core/ tensorflow/core/example/example.proto tensorflow/core/example/feature.proto tensorflow/core/framework/allocation_description.proto tensorflow/core/framework/api_def.proto tensorflow/core/framework/attr_value.proto tensorflow/core/framework/cost_graph.proto tensorflow/core/framework/device_attributes.proto tensorflow/core/framework/function.proto tensorflow/core/framework/graph.proto tensorflow/core/framework/graph_transfer_info.proto tensorflow/core/framework/iterator.proto tensorflow/core/framework/kernel_def.proto tensorflow/core/framework/log_memory.proto tensorflow/core/framework/node_def.proto tensorflow/core/framework/op_def.proto tensorflow/core/framework/reader_base.proto tensorflow/core/framework/remote_fused_graph_execute_info.proto tensorflow/core/framework/resource_handle.proto tensorflow/core/framework/step_stats.proto tensorflow/core/framework/summary.proto tensorflow/core/framework/tensor.proto tensorflow/core/framework/tensor_description.proto tensorflow/core/framework/tensor_shape.proto tensorflow/core/framework/tensor_slice.proto tensorflow/core/framework/types.proto tensorflow/core/framework/variable.proto tensorflow/core/framework/versions.proto tensorflow/core/protobuf/config.proto tensorflow/core/protobuf/cluster.proto tensorflow/core/protobuf/debug.proto tensorflow/core/protobuf/device_properties.proto tensorflow/core/protobuf/queue_runner.proto tensorflow/core/protobuf/rewriter_config.proto tensorflow/core/protobuf/tensor_bundle.proto tensorflow/core/protobuf/saver.proto tensorflow/core/util/event.proto tensorflow/core/util/memmapped_file_system.proto tensorflow/core/util/saved_tensor_slice.proto tensorflow/core/lib/core/error_codes.proto tensorflow/tools/proto_text/placeholder.txt')
Execution platform: @bazel_tools//platforms:host_platform
[834 / 1,712] 9 actions running
@com_google_absl//absl/strings:strings; 1s local
@com_google_absl//absl/strings:strings; 0s local
@com_google_absl//absl/strings:strings; 0s local
@com_google_absl//absl/types:optional; 0s local
@com_google_absl//absl/types:optional; 0s local
@com_google_absl//absl/strings:strings; 0s local
Executing genrule //tensorflow/core:protos_all_proto_text_srcs; 0s local
//tensorflow/core:protos_all_proto_text_srcs; 0s local ...
bazel-out/host/bin/tensorflow/tools/proto_text/gen_proto_text_functions: error while loading shared libraries: libprotobuf.so.18: cannot open shared object file: No such file or directory
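The cause is the same as the first time: LD_LIBRARY_PATH is not passed to the command. This time, however, I do not know which file I should modify to fix it. More generally, I am afraid this will happen whenever the build process invokes an executable that uses libprotobuf/libprotoc. So my question is: is there a way to solve this once and for all (passing LD_LIBRARY_PATH to the environment) by modifying one or more files in the tensorflow project? Or is it a problem inside Bazel itself?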
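The log shows the command being started through "exec env -" with only PATH set, which explains why the variable never reaches gen_proto_text_functions. A small sketch of that effect, using env -i (equivalent to the "env -" form in the log) and a hypothetical library directory /opt/protobuf/lib:

```shell
# Sketch only: /opt/protobuf/lib is a hypothetical library directory.

# Same shape as the failing command: the environment is wiped, only PATH is set.
scrubbed=$(LD_LIBRARY_PATH=/opt/protobuf/lib \
  env -i PATH=/bin:/usr/bin /bin/sh -c 'echo "${LD_LIBRARY_PATH:-<unset>}"')

# The variable only survives if it is listed explicitly on the env command line.
passed=$(env -i PATH=/bin:/usr/bin LD_LIBRARY_PATH=/opt/protobuf/lib \
  /bin/sh -c 'echo "${LD_LIBRARY_PATH:-<unset>}"')

echo "scrubbed environment: $scrubbed"
echo "explicitly passed:    $passed"
```

Bazel does have an --action_env option for adding variables to action environments; whether it also reaches host-configured actions like this one in the Bazel version used here is something I have not verified.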