Synchronous Ethernet networks use a physical layer clock to syntonize the frequency across different network elements.
A basic SyncE node, as defined in ITU-T G.8264, consists of an Ethernet Equipment Clock (EEC) and can recover synchronization from its synchronization inputs: either traffic interfaces or external frequency sources. The EEC can synchronize its frequency (syntonize) to any of those sources. It can also select a synchronization source through priority tables and synchronization status messaging, and it provides the necessary filtering and holdover capabilities.
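To make the source selection described above concrete, here is a minimal sketch of picking a reference from a priority table combined with a quality level. It is not taken from this series or from any driver: the struct, its fields and the "lower value is better" ordering of both quality and priority are assumptions chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one synchronization input (not a kernel type). */
struct sync_source {
	int ql;        /* quality rank learned via SSM; lower is better (assumption) */
	int priority;  /* configured priority; lower is preferred, 0 means disabled */
	int signal_ok; /* non-zero when the input signal is present and valid */
};

/*
 * Return the index of the best usable source, or -1 when no source is
 * usable and the clock would have to fall back to holdover or free-run.
 */
int select_sync_source(const struct sync_source *src, size_t n)
{
	int best = -1;
	size_t i;

	assert(src != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (!src[i].signal_ok || src[i].priority == 0)
			continue;
		/* Better quality wins; equal quality is decided by priority. */
		if (best < 0 ||
		    src[i].ql < src[best].ql ||
		    (src[i].ql == src[best].ql &&
		     src[i].priority < src[best].priority))
			best = (int)i;
	}

	return best;
}
```

A caller would rerun the selection whenever a signal fails or a new quality level is received, and switch the EEC reference if the result changes.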
This patch series introduces a basic interface for reading the Ethernet Equipment Clock (EEC) state on a SyncE-capable device. This state gives information about the source of the syntonization signal (either my port, or any external one) and the state of the EEC. This interface is required to implement Synchronization Status Messaging on upper layers.
v3:
- remove RTM_GETRCLKRANGE
- return state of all possible pins in the RTM_GETRCLKSTATE
- clarify documentation

v2:
- improved documentation
- fixed kdoc warning
RFC history:
v2:
- removed whitespace changes
- fix issues reported by test robot
v3:
- Changed naming from SyncE to EEC
- Clarify cover letter and commit message for patch 1
v4:
- Removed sync_source and pin_idx info
- Changed one structure to attributes
- Added EEC_SRC_PORT flag to indicate that the EEC is synchronized to the recovered clock of a port that returns the state
v5:
- add EEC source as an optional attribute
- implement support for recovered clocks
- align states returned by EEC to ITU-T G.781
v6:
- fix EEC clock state reporting
- add documentation
- fix descriptions in code comments
Maciej Machnikowski (6):
  ice: add support detecting features based on netlist
  rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  ice: add support for reading SyncE DPLL state
  rtnetlink: Add support for SyncE recovered clock configuration
  ice: add support for SyncE recovered clocks
  docs: net: Add description of SyncE interfaces

 Documentation/networking/synce.rst            | 124 ++++++++++
 drivers/net/ethernet/intel/ice/ice.h          |   7 +
 .../net/ethernet/intel/ice/ice_adminq_cmd.h   |  94 +++++++-
 drivers/net/ethernet/intel/ice/ice_common.c   | 224 ++++++++++++++++++
 drivers/net/ethernet/intel/ice/ice_common.h   |  20 +-
 drivers/net/ethernet/intel/ice/ice_devids.h   |   3 +
 drivers/net/ethernet/intel/ice/ice_lib.c      |   6 +-
 drivers/net/ethernet/intel/ice/ice_main.c     | 137 +++++++++++
 drivers/net/ethernet/intel/ice/ice_ptp.c      |  34 +++
 drivers/net/ethernet/intel/ice/ice_ptp_hw.c   |  49 ++++
 drivers/net/ethernet/intel/ice/ice_ptp_hw.h   |  22 ++
 drivers/net/ethernet/intel/ice/ice_type.h     |   1 +
 include/linux/netdevice.h                     |  33 +++
 include/uapi/linux/if_link.h                  |  49 ++++
 include/uapi/linux/rtnetlink.h                |  16 +-
 net/core/rtnetlink.c                          | 189 +++++++++++++++
 security/selinux/nlmsgtab.c                   |   5 +-
 17 files changed, 1005 insertions(+), 8 deletions(-)
 create mode 100644 Documentation/networking/synce.rst
Add new functions to check the netlist of a given board for:
- Recovered Clock device,
- Clock Generation Unit,
- Clock Multiplexer.
Initialize feature bits depending on detected components.
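The detection flow described above can be sketched in a self-contained form. This is an illustration only: the driver queries firmware through the admin queue one index at a time, while the sketch scans a hypothetical in-memory table. The node type and part number values are copied from this patch; `struct netlist_node`, `netlist_has_node()`, `detect_features()` and the `FEAT_*` bits are made-up stand-ins for the driver's structures and ICE_F_* flags.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Node types and part numbers as defined in this patch. */
#define NODE_TYPE_CLK_CTRL	9
#define NODE_TYPE_CLK_MUX	10
#define PART_ZL30632_80032	0x24
#define PART_PKVL		0x31
#define PART_GEN_CLK_MUX	0x47

/* Hypothetical feature bits standing in for ICE_F_SMA_CTRL and friends. */
enum feature_bit {
	FEAT_SMA_CTRL = 1 << 0,
	FEAT_PHY_RCLK = 1 << 1,
	FEAT_CGU      = 1 << 2,
};

/* Hypothetical netlist entry; the real data comes from firmware. */
struct netlist_node {
	uint8_t type;
	uint8_t part_number;
};

/* Return true when a node with the given type and part number exists. */
static bool netlist_has_node(const struct netlist_node *nl, size_t count,
			     uint8_t type, uint8_t part_number)
{
	size_t i;

	for (i = 0; i < count; i++)
		if (nl[i].type == type && nl[i].part_number == part_number)
			return true;

	return false;
}

/* Build the feature mask from the components found in the netlist. */
unsigned int detect_features(const struct netlist_node *nl, size_t count)
{
	unsigned int features = 0;

	if (netlist_has_node(nl, count, NODE_TYPE_CLK_MUX, PART_GEN_CLK_MUX))
		features |= FEAT_SMA_CTRL;
	if (netlist_has_node(nl, count, NODE_TYPE_CLK_CTRL, PART_PKVL))
		features |= FEAT_PHY_RCLK;
	if (netlist_has_node(nl, count, NODE_TYPE_CLK_CTRL, PART_ZL30632_80032))
		features |= FEAT_CGU;

	return features;
}
```

Keying features on detected components rather than on a device ID lets boards that share a device ID but carry different clock hardware expose only what they actually have.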
Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com>
---
 drivers/net/ethernet/intel/ice/ice.h          |   2 +
 .../net/ethernet/intel/ice/ice_adminq_cmd.h   |   7 +-
 drivers/net/ethernet/intel/ice/ice_common.c   | 123 ++++++++++++++++++
 drivers/net/ethernet/intel/ice/ice_common.h   |   9 ++
 drivers/net/ethernet/intel/ice/ice_lib.c      |   6 +-
 drivers/net/ethernet/intel/ice/ice_ptp_hw.c   |   1 +
 drivers/net/ethernet/intel/ice/ice_type.h     |   1 +
 7 files changed, 147 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h index bf4ecd9a517c..3dc4caa41565 100644 --- a/drivers/net/ethernet/intel/ice/ice.h +++ b/drivers/net/ethernet/intel/ice/ice.h @@ -186,6 +186,8 @@
enum ice_feature { ICE_F_DSCP, + ICE_F_CGU, + ICE_F_PHY_RCLK, ICE_F_SMA_CTRL, ICE_F_MAX }; diff --git a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h index 4eef3488d86f..339c2a86f680 100644 --- a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h +++ b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h @@ -1297,6 +1297,8 @@ struct ice_aqc_link_topo_params { #define ICE_AQC_LINK_TOPO_NODE_TYPE_CAGE 6 #define ICE_AQC_LINK_TOPO_NODE_TYPE_MEZZ 7 #define ICE_AQC_LINK_TOPO_NODE_TYPE_ID_EEPROM 8 +#define ICE_AQC_LINK_TOPO_NODE_TYPE_CLK_CTRL 9 +#define ICE_AQC_LINK_TOPO_NODE_TYPE_CLK_MUX 10 #define ICE_AQC_LINK_TOPO_NODE_CTX_S 4 #define ICE_AQC_LINK_TOPO_NODE_CTX_M \ (0xF << ICE_AQC_LINK_TOPO_NODE_CTX_S) @@ -1333,7 +1335,10 @@ struct ice_aqc_link_topo_addr { struct ice_aqc_get_link_topo { struct ice_aqc_link_topo_addr addr; u8 node_part_num; -#define ICE_AQC_GET_LINK_TOPO_NODE_NR_PCA9575 0x21 +#define ICE_AQC_GET_LINK_TOPO_NODE_NR_PCA9575 0x21 +#define ICE_ACQ_GET_LINK_TOPO_NODE_NR_ZL30632_80032 0x24 +#define ICE_ACQ_GET_LINK_TOPO_NODE_NR_PKVL 0x31 +#define ICE_ACQ_GET_LINK_TOPO_NODE_NR_GEN_CLK_MUX 0x47 u8 rsvd[9]; };
diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c index b3066d0fea8b..35903b282885 100644 --- a/drivers/net/ethernet/intel/ice/ice_common.c +++ b/drivers/net/ethernet/intel/ice/ice_common.c @@ -274,6 +274,79 @@ ice_aq_get_link_topo_handle(struct ice_port_info *pi, u8 node_type, return ice_aq_send_cmd(pi->hw, &desc, NULL, 0, cd); }
+/** + * ice_aq_get_netlist_node + * @hw: pointer to the hw struct + * @cmd: get_link_topo AQ structure + * @node_part_number: output node part number if node found + * @node_handle: output node handle parameter if node found + */ +enum ice_status +ice_aq_get_netlist_node(struct ice_hw *hw, struct ice_aqc_get_link_topo *cmd, + u8 *node_part_number, u16 *node_handle) +{ + struct ice_aq_desc desc; + + ice_fill_dflt_direct_cmd_desc(&desc, ice_aqc_opc_get_link_topo); + desc.params.get_link_topo = *cmd; + + if (ice_aq_send_cmd(hw, &desc, NULL, 0, NULL)) + return ICE_ERR_NOT_SUPPORTED; + + if (node_handle) + *node_handle = + le16_to_cpu(desc.params.get_link_topo.addr.handle); + if (node_part_number) + *node_part_number = desc.params.get_link_topo.node_part_num; + + return ICE_SUCCESS; +} + +#define MAX_NETLIST_SIZE 10 +/** + * ice_find_netlist_node + * @hw: pointer to the hw struct + * @node_type_ctx: type of netlist node to look for + * @node_part_number: node part number to look for + * @node_handle: output parameter if node found - optional + * + * Find and return the node handle for a given node type and part number in the + * netlist. When found ICE_SUCCESS is returned, ICE_ERR_DOES_NOT_EXIST + * otherwise. If @node_handle provided, it would be set to found node handle. 
+ */ +enum ice_status +ice_find_netlist_node(struct ice_hw *hw, u8 node_type_ctx, u8 node_part_number, + u16 *node_handle) +{ + struct ice_aqc_get_link_topo cmd; + u8 rec_node_part_number; + enum ice_status status; + u16 rec_node_handle; + u8 idx; + + for (idx = 0; idx < MAX_NETLIST_SIZE; idx++) { + memset(&cmd, 0, sizeof(cmd)); + + cmd.addr.topo_params.node_type_ctx = + (node_type_ctx << ICE_AQC_LINK_TOPO_NODE_TYPE_S); + cmd.addr.topo_params.index = idx; + + status = ice_aq_get_netlist_node(hw, &cmd, + &rec_node_part_number, + &rec_node_handle); + if (status) + return status; + + if (rec_node_part_number == node_part_number) { + if (node_handle) + *node_handle = rec_node_handle; + return ICE_SUCCESS; + } + } + + return ICE_ERR_DOES_NOT_EXIST; +} + /** * ice_is_media_cage_present * @pi: port information structure @@ -5083,3 +5156,53 @@ bool ice_fw_supports_report_dflt_cfg(struct ice_hw *hw) } return false; } + +/** + * ice_is_phy_rclk_present_e810t + * @hw: pointer to the hw struct + * + * Check if the PHY Recovered Clock device is present in the netlist + */ +bool ice_is_phy_rclk_present_e810t(struct ice_hw *hw) +{ + if (ice_find_netlist_node(hw, ICE_AQC_LINK_TOPO_NODE_TYPE_CLK_CTRL, + ICE_ACQ_GET_LINK_TOPO_NODE_NR_PKVL, NULL)) + return false; + + return true; +} + +/** + * ice_is_cgu_present_e810t + * @hw: pointer to the hw struct + * + * Check if the Clock Generation Unit (CGU) device is present in the netlist + */ +bool ice_is_cgu_present_e810t(struct ice_hw *hw) +{ + if (!ice_find_netlist_node(hw, ICE_AQC_LINK_TOPO_NODE_TYPE_CLK_CTRL, + ICE_ACQ_GET_LINK_TOPO_NODE_NR_ZL30632_80032, + NULL)) { + hw->cgu_part_number = + ICE_ACQ_GET_LINK_TOPO_NODE_NR_ZL30632_80032; + return true; + } + return false; +} + +/** + * ice_is_clock_mux_present_e810t + * @hw: pointer to the hw struct + * + * Check if the Clock Multiplexer device is present in the netlist + */ +bool ice_is_clock_mux_present_e810t(struct ice_hw *hw) +{ + if (ice_find_netlist_node(hw, 
ICE_AQC_LINK_TOPO_NODE_TYPE_CLK_MUX, + ICE_ACQ_GET_LINK_TOPO_NODE_NR_GEN_CLK_MUX, + NULL)) + return false; + + return true; +} + diff --git a/drivers/net/ethernet/intel/ice/ice_common.h b/drivers/net/ethernet/intel/ice/ice_common.h index 65c1b3244264..b20a5c085246 100644 --- a/drivers/net/ethernet/intel/ice/ice_common.h +++ b/drivers/net/ethernet/intel/ice/ice_common.h @@ -89,6 +89,12 @@ ice_aq_get_phy_caps(struct ice_port_info *pi, bool qual_mods, u8 report_mode, struct ice_aqc_get_phy_caps_data *caps, struct ice_sq_cd *cd); enum ice_status +ice_aq_get_netlist_node(struct ice_hw *hw, struct ice_aqc_get_link_topo *cmd, + u8 *node_part_number, u16 *node_handle); +enum ice_status +ice_find_netlist_node(struct ice_hw *hw, u8 node_type_ctx, u8 node_part_number, + u16 *node_handle); +enum ice_status ice_aq_list_caps(struct ice_hw *hw, void *buf, u16 buf_size, u32 *cap_count, enum ice_adminq_opc opc, struct ice_sq_cd *cd); enum ice_status @@ -206,4 +212,7 @@ bool ice_fw_supports_lldp_fltr_ctrl(struct ice_hw *hw); enum ice_status ice_lldp_fltr_add_remove(struct ice_hw *hw, u16 vsi_num, bool add); bool ice_fw_supports_report_dflt_cfg(struct ice_hw *hw); +bool ice_is_phy_rclk_present_e810t(struct ice_hw *hw); +bool ice_is_cgu_present_e810t(struct ice_hw *hw); +bool ice_is_clock_mux_present_e810t(struct ice_hw *hw); #endif /* _ICE_COMMON_H_ */ diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c index 40562600a8cf..2422215b7937 100644 --- a/drivers/net/ethernet/intel/ice/ice_lib.c +++ b/drivers/net/ethernet/intel/ice/ice_lib.c @@ -4183,8 +4183,12 @@ void ice_init_feature_support(struct ice_pf *pf) case ICE_DEV_ID_E810C_QSFP: case ICE_DEV_ID_E810C_SFP: ice_set_feature_support(pf, ICE_F_DSCP); - if (ice_is_e810t(&pf->hw)) + if (ice_is_clock_mux_present_e810t(&pf->hw)) ice_set_feature_support(pf, ICE_F_SMA_CTRL); + if (ice_is_phy_rclk_present_e810t(&pf->hw)) + ice_set_feature_support(pf, ICE_F_PHY_RCLK); + if 
(ice_is_cgu_present_e810t(&pf->hw)) + ice_set_feature_support(pf, ICE_F_CGU); break; default: break; diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c index 29f947c0cd2e..aa257db36765 100644 --- a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c +++ b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c @@ -800,3 +800,4 @@ bool ice_is_pca9575_present(struct ice_hw *hw)
return !status && handle; } + diff --git a/drivers/net/ethernet/intel/ice/ice_type.h b/drivers/net/ethernet/intel/ice/ice_type.h index 9e0c2923c62e..a9dc16641bd4 100644 --- a/drivers/net/ethernet/intel/ice/ice_type.h +++ b/drivers/net/ethernet/intel/ice/ice_type.h @@ -920,6 +920,7 @@ struct ice_hw { struct list_head rss_list_head; struct ice_mbx_snapshot mbx_snapshot; u16 io_expander_handle; + u8 cgu_part_number; };
/* Statistics collected by each port, VSI, VEB, and S-channel */
This patch series introduces a basic interface for reading the Ethernet Equipment Clock (EEC) state on a SyncE-capable device. This state gives information about the state of the EEC. This interface is required to implement Synchronization Status Messaging on upper layers.
The initial implementation returns the SyncE EEC state in the IFLA_EEC_STATE attribute. The optional index of the input that is used as a source can be returned in the IFLA_EEC_SRC_IDX attribute.
Drivers implement the SyncE EEC state read as the ndo_get_eec_state callback. The source index is read by calling ndo_get_eec_src.
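As a rough illustration of how an upper layer might consume the reported state, the sketch below mirrors the enum if_eec_state proposed in this patch and adds two helpers. The helpers `eec_state_name()` and `eec_is_locked()` are hypothetical and not part of the series; the rule that only locked states count as traceable to a reference is an assumption about how an SSM implementation could use the value.

```c
#include <stdbool.h>

/* Mirrors enum if_eec_state as proposed in this patch. */
enum if_eec_state {
	IF_EEC_STATE_INVALID = 0,
	IF_EEC_STATE_FREERUN,
	IF_EEC_STATE_LOCKED,
	IF_EEC_STATE_LOCKED_HO_ACQ,
	IF_EEC_STATE_HOLDOVER,
};

/* Hypothetical helper: human-readable name for logs and tools. */
const char *eec_state_name(enum if_eec_state state)
{
	switch (state) {
	case IF_EEC_STATE_FREERUN:
		return "freerun";
	case IF_EEC_STATE_LOCKED:
		return "locked";
	case IF_EEC_STATE_LOCKED_HO_ACQ:
		return "locked-ho-acq";
	case IF_EEC_STATE_HOLDOVER:
		return "holdover";
	case IF_EEC_STATE_INVALID:
	default:
		return "invalid";
	}
}

/*
 * Hypothetical helper: true when the EEC currently follows a reference,
 * so the quality level of that reference could be passed downstream.
 */
bool eec_is_locked(enum if_eec_state state)
{
	return state == IF_EEC_STATE_LOCKED ||
	       state == IF_EEC_STATE_LOCKED_HO_ACQ;
}
```

In holdover or free-run, a daemon built on this would advertise a quality level that reflects the local clock instead of the lost reference.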
Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com>
---
 include/linux/netdevice.h      | 13 ++++++
 include/uapi/linux/if_link.h   | 27 ++++++++++++
 include/uapi/linux/rtnetlink.h |  3 ++
 net/core/rtnetlink.c           | 79 ++++++++++++++++++++++++++++++++++
 security/selinux/nlmsgtab.c    |  3 +-
 5 files changed, 124 insertions(+), 1 deletion(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 3ec42495a43a..ef2b381dae0c 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -1344,6 +1344,13 @@ struct netdev_net_notifier { * The caller must be under RCU read context. * int (*ndo_fill_forward_path)(struct net_device_path_ctx *ctx, struct net_device_path *path); * Get the forwarding path to reach the real device from the HW destination address + * int (*ndo_get_eec_state)(struct net_device *dev, enum if_eec_state *state, + * u32 *src_idx, struct netlink_ext_ack *extack); + * Get state of physical layer frequency synchronization (SyncE) + * int (*ndo_get_eec_src)(struct net_device *dev, u32 *src, + * struct netlink_ext_ack *extack); + * Get the index of the source signal that's currently used as EEC's + * reference */ struct net_device_ops { int (*ndo_init)(struct net_device *dev); @@ -1563,6 +1570,12 @@ struct net_device_ops { struct net_device * (*ndo_get_peer_dev)(struct net_device *dev); int (*ndo_fill_forward_path)(struct net_device_path_ctx *ctx, struct net_device_path *path); + int (*ndo_get_eec_state)(struct net_device *dev, + enum if_eec_state *state, + struct netlink_ext_ack *extack); + int (*ndo_get_eec_src)(struct net_device *dev, + u32 *src, + struct netlink_ext_ack *extack); };
/** diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h index eebd3894fe89..3628a55fdd10 100644 --- a/include/uapi/linux/if_link.h +++ b/include/uapi/linux/if_link.h @@ -1273,4 +1273,31 @@ enum {
#define IFLA_MCTP_MAX (__IFLA_MCTP_MAX - 1)
+/* SyncE section */ + +enum if_eec_state { + IF_EEC_STATE_INVALID = 0, /* state is not valid */ + IF_EEC_STATE_FREERUN, /* clock is free-running */ + IF_EEC_STATE_LOCKED, /* clock is locked to the reference, + * but the holdover memory is not valid + */ + IF_EEC_STATE_LOCKED_HO_ACQ, /* clock is locked to the reference + * and holdover memory is valid + */ + IF_EEC_STATE_HOLDOVER, /* clock is in holdover mode */ +}; + +struct if_eec_state_msg { + __u32 ifindex; +}; + +enum { + IFLA_EEC_UNSPEC, + IFLA_EEC_STATE, + IFLA_EEC_SRC_IDX, + __IFLA_EEC_MAX, +}; + +#define IFLA_EEC_MAX (__IFLA_EEC_MAX - 1) + #endif /* _UAPI_LINUX_IF_LINK_H */ diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h index 5888492a5257..1d8662afd6bd 100644 --- a/include/uapi/linux/rtnetlink.h +++ b/include/uapi/linux/rtnetlink.h @@ -185,6 +185,9 @@ enum { RTM_GETNEXTHOPBUCKET, #define RTM_GETNEXTHOPBUCKET RTM_GETNEXTHOPBUCKET
+ RTM_GETEECSTATE = 124, +#define RTM_GETEECSTATE RTM_GETEECSTATE + __RTM_MAX, #define RTM_MAX (((__RTM_MAX + 3) & ~3) - 1) }; diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 2af8aeeadadf..03bc773d0e69 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -5467,6 +5467,83 @@ static int rtnl_stats_dump(struct sk_buff *skb, struct netlink_callback *cb) return skb->len; }
+static int rtnl_fill_eec_state(struct sk_buff *skb, struct net_device *dev, + u32 portid, u32 seq, struct netlink_callback *cb, + int flags, struct netlink_ext_ack *extack) +{ + const struct net_device_ops *ops = dev->netdev_ops; + struct if_eec_state_msg *state_msg; + enum if_eec_state state; + struct nlmsghdr *nlh; + u32 src_idx; + int err; + + ASSERT_RTNL(); + + if (!ops->ndo_get_eec_state) + return -EOPNOTSUPP; + + err = ops->ndo_get_eec_state(dev, &state, extack); + if (err) + return err; + + nlh = nlmsg_put(skb, portid, seq, RTM_GETEECSTATE, sizeof(*state_msg), + flags); + if (!nlh) + return -EMSGSIZE; + + state_msg = nlmsg_data(nlh); + state_msg->ifindex = dev->ifindex; + + if (nla_put_u32(skb, IFLA_EEC_STATE, state)) + return -EMSGSIZE; + + if (!ops->ndo_get_eec_src) + goto end_msg; + + err = ops->ndo_get_eec_src(dev, &src_idx, extack); + if (err) + return err; + + if (nla_put_u32(skb, IFLA_EEC_SRC_IDX, src_idx)) + return -EMSGSIZE; + +end_msg: + nlmsg_end(skb, nlh); + return 0; +} + +static int rtnl_eec_state_get(struct sk_buff *skb, struct nlmsghdr *nlh, + struct netlink_ext_ack *extack) +{ + struct net *net = sock_net(skb->sk); + struct if_eec_state_msg *state; + struct net_device *dev; + struct sk_buff *nskb; + int err; + + state = nlmsg_data(nlh); + dev = __dev_get_by_index(net, state->ifindex); + if (!dev) { + NL_SET_ERR_MSG(extack, "unknown ifindex"); + return -ENODEV; + } + + nskb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); + if (!nskb) + return -ENOBUFS; + + err = rtnl_fill_eec_state(nskb, dev, NETLINK_CB(skb).portid, + nlh->nlmsg_seq, NULL, nlh->nlmsg_flags, + extack); + if (err < 0) + kfree_skb(nskb); + else + err = rtnl_unicast(nskb, net, NETLINK_CB(skb).portid); + + return err; +} + /* Process one rtnetlink message. */
static int rtnetlink_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, @@ -5692,4 +5769,6 @@ void __init rtnetlink_init(void)
rtnl_register(PF_UNSPEC, RTM_GETSTATS, rtnl_stats_get, rtnl_stats_dump, 0); + + rtnl_register(PF_UNSPEC, RTM_GETEECSTATE, rtnl_eec_state_get, NULL, 0); } diff --git a/security/selinux/nlmsgtab.c b/security/selinux/nlmsgtab.c index 94ea2a8b2bb7..2c66e722ea9c 100644 --- a/security/selinux/nlmsgtab.c +++ b/security/selinux/nlmsgtab.c @@ -91,6 +91,7 @@ static const struct nlmsg_perm nlmsg_route_perms[] = { RTM_NEWNEXTHOPBUCKET, NETLINK_ROUTE_SOCKET__NLMSG_WRITE }, { RTM_DELNEXTHOPBUCKET, NETLINK_ROUTE_SOCKET__NLMSG_WRITE }, { RTM_GETNEXTHOPBUCKET, NETLINK_ROUTE_SOCKET__NLMSG_READ }, + { RTM_GETEECSTATE, NETLINK_ROUTE_SOCKET__NLMSG_READ }, };
static const struct nlmsg_perm nlmsg_tcpdiag_perms[] = @@ -176,7 +177,7 @@ int selinux_nlmsg_lookup(u16 sclass, u16 nlmsg_type, u32 *perm) * structures at the top of this file with the new mappings * before updating the BUILD_BUG_ON() macro! */ - BUILD_BUG_ON(RTM_MAX != (RTM_NEWNEXTHOPBUCKET + 3)); + BUILD_BUG_ON(RTM_MAX != (RTM_GETEECSTATE + 3)); err = nlmsg_perm(nlmsg_type, perm, nlmsg_route_perms, sizeof(nlmsg_route_perms)); break;
Hello Maciej,
2021-11-10, 12:44:44 +0100, Maciej Machnikowski wrote:
diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h index 5888492a5257..1d8662afd6bd 100644 --- a/include/uapi/linux/rtnetlink.h +++ b/include/uapi/linux/rtnetlink.h @@ -185,6 +185,9 @@ enum { RTM_GETNEXTHOPBUCKET, #define RTM_GETNEXTHOPBUCKET RTM_GETNEXTHOPBUCKET
+	RTM_GETEECSTATE = 124,
+#define RTM_GETEECSTATE RTM_GETEECSTATE
I'm not sure about this. All the other RTM_GETxxx are such that RTM_GETxxx % 4 == 2. Following the current pattern, 124 should be reserved for RTM_NEWxxx, and RTM_GETEECSTATE would be 126.
Also, why are you leaving a gap (which you end up filling in patch 4/6)?
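The numbering pattern mentioned above can be checked with a short sketch. The values 118, 124 and 126 and the RTM_MAX formula come from this thread and the quoted header; the function names are hypothetical, and reading the low two bits as the operation kind is my understanding of the rtnetlink convention rather than something stated in the patch.

```c
/* Operation kinds encoded in the low two bits of an rtnetlink type. */
enum rtm_kind {
	RTM_KIND_NEW = 0,
	RTM_KIND_DEL = 1,
	RTM_KIND_GET = 2,
	RTM_KIND_SET = 3,
};

/* Hypothetical helper: extract the operation kind from a message type. */
int rtm_kind(int msg_type)
{
	return msg_type & 3;
}

/* Same arithmetic as the RTM_MAX macro, applied to a given __RTM_MAX. */
int rtm_max_for(int rtm_max_enum)
{
	return ((rtm_max_enum + 3) & ~3) - 1;
}
```

With these helpers, 118 (RTM_GETNEXTHOPBUCKET) and 126 map to the GET kind, while 124 maps to the NEW kind, which is the mismatch being pointed out. Note that moving the message to 126 would leave RTM_MAX at 127, so the BUILD_BUG_ON in nlmsgtab.c would need `+ 1` instead of `+ 3`.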
 	__RTM_MAX,
#define RTM_MAX (((__RTM_MAX + 3) & ~3) - 1) }; diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 2af8aeeadadf..03bc773d0e69 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -5467,6 +5467,83 @@ static int rtnl_stats_dump(struct sk_buff *skb, struct netlink_callback *cb) return skb->len; } +static int rtnl_fill_eec_state(struct sk_buff *skb, struct net_device *dev,
u32 portid, u32 seq, struct netlink_callback *cb,
int flags, struct netlink_ext_ack *extack)
+{
[...]
+	nlh = nlmsg_put(skb, portid, seq, RTM_GETEECSTATE, sizeof(*state_msg),
+			flags);
+	if (!nlh)
+		return -EMSGSIZE;
+
+	state_msg = nlmsg_data(nlh);
+	state_msg->ifindex = dev->ifindex;
Why stuff this in a struct instead of using an attribute?
+	if (nla_put_u32(skb, IFLA_EEC_STATE, state))
+		return -EMSGSIZE;
+
+	if (!ops->ndo_get_eec_src)
+		goto end_msg;
+
+	err = ops->ndo_get_eec_src(dev, &src_idx, extack);
+	if (err)
+		return err;
+
+	if (nla_put_u32(skb, IFLA_EEC_SRC_IDX, src_idx))
+		return -EMSGSIZE;
+
+end_msg:
+	nlmsg_end(skb, nlh);
+	return 0;
+}
Thanks,
Sabrina Dubroca sd@queasysnail.net wrote:
Hello Maciej,
2021-11-10, 12:44:44 +0100, Maciej Machnikowski wrote:
diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h index 5888492a5257..1d8662afd6bd 100644 --- a/include/uapi/linux/rtnetlink.h +++ b/include/uapi/linux/rtnetlink.h @@ -185,6 +185,9 @@ enum { RTM_GETNEXTHOPBUCKET, #define RTM_GETNEXTHOPBUCKET RTM_GETNEXTHOPBUCKET
+	RTM_GETEECSTATE = 124,
+#define RTM_GETEECSTATE RTM_GETEECSTATE
I'm not sure about this. All the other RTM_GETxxx are such that RTM_GETxxx % 4 == 2. Following the current pattern, 124 should be reserved for RTM_NEWxxx, and RTM_GETEECSTATE would be 126.
More importantly, why is this added to rtnetlink (routing sockets)? It appears to be unrelated?
Looks like this should be in ethtool (it has netlink api nowadays) or devlink.
-----Original Message-----
From: Florian Westphal <fw@strlen.de>
Sent: Thursday, November 11, 2021 5:23 PM
To: Sabrina Dubroca <sd@queasysnail.net>
Subject: Re: [PATCH v3 net-next 2/6] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
Sabrina Dubroca sd@queasysnail.net wrote:
Hello Maciej,
2021-11-10, 12:44:44 +0100, Maciej Machnikowski wrote:
diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h index 5888492a5257..1d8662afd6bd 100644 --- a/include/uapi/linux/rtnetlink.h +++ b/include/uapi/linux/rtnetlink.h @@ -185,6 +185,9 @@ enum { RTM_GETNEXTHOPBUCKET, #define RTM_GETNEXTHOPBUCKET RTM_GETNEXTHOPBUCKET
+	RTM_GETEECSTATE = 124,
+#define RTM_GETEECSTATE RTM_GETEECSTATE
I'm not sure about this. All the other RTM_GETxxx are such that RTM_GETxxx % 4 == 2. Following the current pattern, 124 should be reserved for RTM_NEWxxx, and RTM_GETEECSTATE would be 126.
More importantly, why is this added to rtnetlink (routing sockets)? It appears to be unrelated?
Looks like this should be in ethtool (it has netlink api nowadays) or devlink.
We identified it as a generic place in previous RFCs. Ethtool calls are not available in non-Ethernet packet networks, and the idea behind this functionality is that any packet network can implement it: SONET, GPON or even wireless.
Machnikowski, Maciej maciej.machnikowski@intel.com wrote:
More importantly, why is this added to rtnetlink (routing sockets)? It appears to be unrelated?
Looks like this should be in ethtool (it has netlink api nowadays) or devlink.
We identified it as a generic place in previous RFCs.
That doesn't answer my question. EECSTATE doesn't appear to be related to anything else that's currently exposed via rtnetlink from a conceptual point of view.
Ethtool calls are not available in non-ethernet packet networks
That's news to me. ethtool ops are linked via the netdevice struct.
and the concept of that functionality is - any packet network can implement it - SONET, GPON or even wireless.
Ethtool ops expose a wide range of low-level functions not related to Ethernet, e.g. EEPROM dump, interrupt coalescing settings, and so on and so forth.
But hey, if net maintainers are ok with rtnetlink...
I just feel putting SyncE interaction in rtnetlink is arbitrary and sets a bad precedent.
On Tue, 16 Nov 2021 16:41:11 +0100 Florian Westphal wrote:
and the concept of that functionality is - any packet network can implement it - SONET, GPON or even wireless.
Ethtool ops expose a wide range of low-level functions not related to Ethernet, e.g. EEPROM dump, interrupt coalescing settings, and so on and so forth.
But hey, if net maintainers are ok with rtnetlink...
I agree, this has been brought up by 5 people or so already.
-----Original Message-----
From: Sabrina Dubroca <sd@queasysnail.net>
Sent: Thursday, November 11, 2021 5:01 PM
To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
Subject: Re: [PATCH v3 net-next 2/6] rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
Hello Maciej,
2021-11-10, 12:44:44 +0100, Maciej Machnikowski wrote:
diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h index 5888492a5257..1d8662afd6bd 100644 --- a/include/uapi/linux/rtnetlink.h +++ b/include/uapi/linux/rtnetlink.h @@ -185,6 +185,9 @@ enum { RTM_GETNEXTHOPBUCKET, #define RTM_GETNEXTHOPBUCKET RTM_GETNEXTHOPBUCKET
+	RTM_GETEECSTATE = 124,
+#define RTM_GETEECSTATE RTM_GETEECSTATE
I'm not sure about this. All the other RTM_GETxxx are such that RTM_GETxxx % 4 == 2. Following the current pattern, 124 should be reserved for RTM_NEWxxx, and RTM_GETEECSTATE would be 126.
Also, why are you leaving a gap (which you end up filling in patch 4/6)?
Hmmm, I missed that. Is there any guide on how to number them? I'd be happy to follow the pattern there and will fix it in the next revision.
The gap is there as this was developed first, but most likely this part will be removed in the next revision in favor of a DPLL subsystem.
 	__RTM_MAX,
#define RTM_MAX (((__RTM_MAX + 3) & ~3) - 1) }; diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 2af8aeeadadf..03bc773d0e69 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -5467,6 +5467,83 @@ static int rtnl_stats_dump(struct sk_buff *skb,
struct netlink_callback *cb)
return skb->len; }
+static int rtnl_fill_eec_state(struct sk_buff *skb, struct net_device *dev,
u32 portid, u32 seq, struct netlink_callback *cb,
int flags, struct netlink_ext_ack *extack)
+{
[...]
+	nlh = nlmsg_put(skb, portid, seq, RTM_GETEECSTATE, sizeof(*state_msg),
+			flags);
+	if (!nlh)
+		return -EMSGSIZE;
+
+	state_msg = nlmsg_data(nlh);
+	state_msg->ifindex = dev->ifindex;
Why stuff this in a struct instead of using an attribute?
Since it's the required parameter to identify what port is in question.
+	if (nla_put_u32(skb, IFLA_EEC_STATE, state))
+		return -EMSGSIZE;
+
+	if (!ops->ndo_get_eec_src)
+		goto end_msg;
+
+	err = ops->ndo_get_eec_src(dev, &src_idx, extack);
+	if (err)
+		return err;
+
+	if (nla_put_u32(skb, IFLA_EEC_SRC_IDX, src_idx))
+		return -EMSGSIZE;
+
+end_msg:
+	nlmsg_end(skb, nlh);
+	return 0;
+}
Thanks,
-- Sabrina
Implement SyncE DPLL monitoring for E810-T devices. A poll loop periodically checks the state of the DPLL and caches it in the pf structure. State changes are logged in the system log.
The cached state can be read using the RTM_GETEECSTATE rtnetlink message.
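One way the poll loop could turn the raw DPLL status word into an EEC state is sketched below. The bit positions follow the ICE_AQC_GET_CGU_DPLL_STATUS_STATE_* defines in this patch, and the enum mirrors if_eec_state from patch 2/6. The mapping rules themselves, in particular treating holdover as valid only after a lock with holdover memory acquired, are an assumption for illustration and may differ from the driver's actual conversion. `dpll_to_eec_state()` and `dpll_phase_offset()` are hypothetical names.

```c
#include <stdint.h>

/* Bit positions taken from the dpll_state defines in this patch. */
#define DPLL_STATE_LOCK		(1u << 0)
#define DPLL_STATE_HO		(1u << 1)
#define DPLL_STATE_HO_READY	(1u << 2)

/* Mirrors enum if_eec_state from patch 2/6. */
enum if_eec_state {
	IF_EEC_STATE_INVALID = 0,
	IF_EEC_STATE_FREERUN,
	IF_EEC_STATE_LOCKED,
	IF_EEC_STATE_LOCKED_HO_ACQ,
	IF_EEC_STATE_HOLDOVER,
};

/* Map a raw status word plus the previously cached state to an EEC state. */
enum if_eec_state dpll_to_eec_state(uint16_t dpll_state,
				    enum if_eec_state last)
{
	if (dpll_state & DPLL_STATE_LOCK)
		return (dpll_state & DPLL_STATE_HO_READY) ?
		       IF_EEC_STATE_LOCKED_HO_ACQ : IF_EEC_STATE_LOCKED;

	/* Assumption: holdover is only meaningful with valid holdover memory. */
	if ((dpll_state & DPLL_STATE_HO) &&
	    (last == IF_EEC_STATE_LOCKED_HO_ACQ ||
	     last == IF_EEC_STATE_HOLDOVER))
		return IF_EEC_STATE_HOLDOVER;

	return IF_EEC_STATE_FREERUN;
}

/* Combine the two 32-bit halves of the phase offset, as the AQ helper does. */
uint64_t dpll_phase_offset(uint32_t hi, uint32_t lo)
{
	return ((uint64_t)hi << 32) | lo;
}
```

Passing the previous state in keeps the function pure, so the poll loop only has to compare the result with the cached value to decide whether to log a state change.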
Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com>
---
 drivers/net/ethernet/intel/ice/ice.h          |  5 ++
 .../net/ethernet/intel/ice/ice_adminq_cmd.h   | 34 +++++++++++++
 drivers/net/ethernet/intel/ice/ice_common.c   | 36 ++++++++++++++
 drivers/net/ethernet/intel/ice/ice_common.h   |  5 +-
 drivers/net/ethernet/intel/ice/ice_devids.h   |  3 ++
 drivers/net/ethernet/intel/ice/ice_main.c     | 46 ++++++++++++++++++
 drivers/net/ethernet/intel/ice/ice_ptp.c      | 34 +++++++++++++
 drivers/net/ethernet/intel/ice/ice_ptp_hw.c   | 48 +++++++++++++++++++
 drivers/net/ethernet/intel/ice/ice_ptp_hw.h   | 22 +++++++++
 9 files changed, 232 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h index 3dc4caa41565..1dff7ca704d4 100644 --- a/drivers/net/ethernet/intel/ice/ice.h +++ b/drivers/net/ethernet/intel/ice/ice.h @@ -609,6 +609,11 @@ struct ice_pf { #define ICE_VF_AGG_NODE_ID_START 65 #define ICE_MAX_VF_AGG_NODES 32 struct ice_agg_node vf_agg_node[ICE_MAX_VF_AGG_NODES]; + + enum if_eec_state synce_dpll_state; + u8 synce_dpll_pin; + enum if_eec_state ptp_dpll_state; + u8 ptp_dpll_pin; };
struct ice_netdev_priv { diff --git a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h index 339c2a86f680..11226af7a9a4 100644 --- a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h +++ b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h @@ -1808,6 +1808,36 @@ struct ice_aqc_add_rdma_qset_data { struct ice_aqc_add_tx_rdma_qset_entry rdma_qsets[]; };
+/* Get CGU DPLL status (direct 0x0C66) */ +struct ice_aqc_get_cgu_dpll_status { + u8 dpll_num; + u8 ref_state; +#define ICE_AQC_GET_CGU_DPLL_STATUS_REF_SW_LOS BIT(0) +#define ICE_AQC_GET_CGU_DPLL_STATUS_REF_SW_SCM BIT(1) +#define ICE_AQC_GET_CGU_DPLL_STATUS_REF_SW_CFM BIT(2) +#define ICE_AQC_GET_CGU_DPLL_STATUS_REF_SW_GST BIT(3) +#define ICE_AQC_GET_CGU_DPLL_STATUS_REF_SW_PFM BIT(4) +#define ICE_AQC_GET_CGU_DPLL_STATUS_REF_SW_ESYNC BIT(6) +#define ICE_AQC_GET_CGU_DPLL_STATUS_FAST_LOCK_EN BIT(7) + __le16 dpll_state; +#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_LOCK BIT(0) +#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_HO BIT(1) +#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_HO_READY BIT(2) +#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_FLHIT BIT(5) +#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_PSLHIT BIT(7) +#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_CLK_REF_SHIFT 8 +#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_CLK_REF_SEL \ + ICE_M(0x1F, ICE_AQC_GET_CGU_DPLL_STATUS_STATE_CLK_REF_SHIFT) +#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_MODE_SHIFT 13 +#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_MODE \ + ICE_M(0x7, ICE_AQC_GET_CGU_DPLL_STATUS_STATE_MODE_SHIFT) + __le32 phase_offset_h; + __le32 phase_offset_l; + u8 eec_mode; + u8 rsvd[1]; + __le16 node_handle; +}; + /* Configure Firmware Logging Command (indirect 0xFF09) * Logging Information Read Response (indirect 0xFF10) * Note: The 0xFF10 command has no input parameters. @@ -2039,6 +2069,7 @@ struct ice_aq_desc { struct ice_aqc_fw_logging fw_logging; struct ice_aqc_get_clear_fw_log get_clear_fw_log; struct ice_aqc_download_pkg download_pkg; + struct ice_aqc_get_cgu_dpll_status get_cgu_dpll_status; struct ice_aqc_driver_shared_params drv_shared_params; struct ice_aqc_set_mac_lb set_mac_lb; struct ice_aqc_alloc_free_res_cmd sw_res_ctrl; @@ -2205,6 +2236,9 @@ enum ice_adminq_opc { ice_aqc_opc_update_pkg = 0x0C42, ice_aqc_opc_get_pkg_info_list = 0x0C43,
+ /* 1588/SyncE commands/events */ + ice_aqc_opc_get_cgu_dpll_status = 0x0C66, + ice_aqc_opc_driver_shared_params = 0x0C90,
/* Standalone Commands/Events */ diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c index 35903b282885..8069141ac105 100644 --- a/drivers/net/ethernet/intel/ice/ice_common.c +++ b/drivers/net/ethernet/intel/ice/ice_common.c @@ -4644,6 +4644,42 @@ ice_dis_vsi_rdma_qset(struct ice_port_info *pi, u16 count, u32 *qset_teid, return ice_status_to_errno(status); }
+/** + * ice_aq_get_cgu_dpll_status + * @hw: pointer to the HW struct + * @dpll_num: DPLL index + * @ref_state: Reference clock state + * @dpll_state: DPLL state + * @phase_offset: Phase offset in ps + * @eec_mode: EEC_mode + * + * Get CGU DPLL status (0x0C66) + */ +enum ice_status +ice_aq_get_cgu_dpll_status(struct ice_hw *hw, u8 dpll_num, u8 *ref_state, + u16 *dpll_state, u64 *phase_offset, u8 *eec_mode) +{ + struct ice_aqc_get_cgu_dpll_status *cmd; + struct ice_aq_desc desc; + enum ice_status status; + + ice_fill_dflt_direct_cmd_desc(&desc, ice_aqc_opc_get_cgu_dpll_status); + cmd = &desc.params.get_cgu_dpll_status; + cmd->dpll_num = dpll_num; + + status = ice_aq_send_cmd(hw, &desc, NULL, 0, NULL); + if (!status) { + *ref_state = cmd->ref_state; + *dpll_state = le16_to_cpu(cmd->dpll_state); + *phase_offset = le32_to_cpu(cmd->phase_offset_h); + *phase_offset <<= 32; + *phase_offset += le32_to_cpu(cmd->phase_offset_l); + *eec_mode = cmd->eec_mode; + } + + return status; +} + /** * ice_replay_pre_init - replay pre initialization * @hw: pointer to the HW struct diff --git a/drivers/net/ethernet/intel/ice/ice_common.h b/drivers/net/ethernet/intel/ice/ice_common.h index b20a5c085246..aaed388a40a8 100644 --- a/drivers/net/ethernet/intel/ice/ice_common.h +++ b/drivers/net/ethernet/intel/ice/ice_common.h @@ -106,6 +106,7 @@ enum ice_status ice_aq_manage_mac_write(struct ice_hw *hw, const u8 *mac_addr, u8 flags, struct ice_sq_cd *cd); bool ice_is_e810(struct ice_hw *hw); +bool ice_is_e810t(struct ice_hw *hw); enum ice_status ice_clear_pf_cfg(struct ice_hw *hw); enum ice_status ice_aq_set_phy_cfg(struct ice_hw *hw, struct ice_port_info *pi, @@ -162,6 +163,9 @@ ice_cfg_vsi_rdma(struct ice_port_info *pi, u16 vsi_handle, u16 tc_bitmap, int ice_ena_vsi_rdma_qset(struct ice_port_info *pi, u16 vsi_handle, u8 tc, u16 *rdma_qset, u16 num_qsets, u32 *qset_teid); +enum ice_status +ice_aq_get_cgu_dpll_status(struct ice_hw *hw, u8 dpll_num, u8 *ref_state, + u16 *dpll_state, u64 
*phase_offset, u8 *eec_mode); int ice_dis_vsi_rdma_qset(struct ice_port_info *pi, u16 count, u32 *qset_teid, u16 *q_id); @@ -189,7 +193,6 @@ ice_stat_update40(struct ice_hw *hw, u32 reg, bool prev_stat_loaded, void ice_stat_update32(struct ice_hw *hw, u32 reg, bool prev_stat_loaded, u64 *prev_stat, u64 *cur_stat); -bool ice_is_e810t(struct ice_hw *hw); enum ice_status ice_sched_query_elem(struct ice_hw *hw, u32 node_teid, struct ice_aqc_txsched_elem_data *buf); diff --git a/drivers/net/ethernet/intel/ice/ice_devids.h b/drivers/net/ethernet/intel/ice/ice_devids.h index 61dd2f18dee8..0b654d417d29 100644 --- a/drivers/net/ethernet/intel/ice/ice_devids.h +++ b/drivers/net/ethernet/intel/ice/ice_devids.h @@ -58,4 +58,7 @@ /* Intel(R) Ethernet Connection E822-L 1GbE */ #define ICE_DEV_ID_E822L_SGMII 0x189A
+#define ICE_SUBDEV_ID_E810T 0x000E +#define ICE_SUBDEV_ID_E810T2 0x000F + #endif /* _ICE_DEVIDS_H_ */ diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c index f099797f35e3..7fac27903ab4 100644 --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -6240,6 +6240,50 @@ static void ice_napi_disable_all(struct ice_vsi *vsi) } }
+/** + * ice_get_eec_state - get state of SyncE DPLL + * @netdev: network interface device structure + * @state: state of SyncE DPLL + * @extack: netlink extended ack + */ +static int +ice_get_eec_state(struct net_device *netdev, enum if_eec_state *state, + struct netlink_ext_ack *extack) +{ + struct ice_netdev_priv *np = netdev_priv(netdev); + struct ice_vsi *vsi = np->vsi; + struct ice_pf *pf = vsi->back; + + if (!ice_is_feature_supported(pf, ICE_F_CGU)) + return -EOPNOTSUPP; + + *state = pf->synce_dpll_state; + + return 0; +} + +/** + * ice_get_eec_src - get reference index of SyncE DPLL + * @netdev: network interface device structure + * @src: index of source reference of the SyncE DPLL + * @extack: netlink extended ack + */ +static int +ice_get_eec_src(struct net_device *netdev, u32 *src, + struct netlink_ext_ack *extack) +{ + struct ice_netdev_priv *np = netdev_priv(netdev); + struct ice_vsi *vsi = np->vsi; + struct ice_pf *pf = vsi->back; + + if (!ice_is_feature_supported(pf, ICE_F_CGU)) + return -EOPNOTSUPP; + + *src = pf->synce_dpll_pin; + + return 0; +} + /** * ice_down - Shutdown the connection * @vsi: The VSI being stopped @@ -8601,4 +8645,6 @@ static const struct net_device_ops ice_netdev_ops = { .ndo_bpf = ice_xdp, .ndo_xdp_xmit = ice_xdp_xmit, .ndo_xsk_wakeup = ice_xsk_wakeup, + .ndo_get_eec_state = ice_get_eec_state, + .ndo_get_eec_src = ice_get_eec_src, }; diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c index bf7247c6f58e..a38d0ab4d6d5 100644 --- a/drivers/net/ethernet/intel/ice/ice_ptp.c +++ b/drivers/net/ethernet/intel/ice/ice_ptp.c @@ -1766,6 +1766,36 @@ static void ice_ptp_tx_tstamp_cleanup(struct ice_ptp_tx *tx) } }
+static void ice_handle_cgu_state(struct ice_pf *pf)
+{
+	enum if_eec_state cgu_state;
+	u8 pin;
+
+	cgu_state = ice_get_zl_dpll_state(&pf->hw, ICE_CGU_DPLL_SYNCE, &pin);
+	if (pf->synce_dpll_state != cgu_state) {
+		pf->synce_dpll_state = cgu_state;
+		pf->synce_dpll_pin = pin;
+
+		dev_warn(ice_pf_to_dev(pf),
+			 "<DPLL%i> state changed to: %d, pin %d\n",
+			 ICE_CGU_DPLL_SYNCE,
+			 pf->synce_dpll_state,
+			 pin);
+	}
+
+	cgu_state = ice_get_zl_dpll_state(&pf->hw, ICE_CGU_DPLL_PTP, &pin);
+	if (pf->ptp_dpll_state != cgu_state) {
+		pf->ptp_dpll_state = cgu_state;
+		pf->ptp_dpll_pin = pin;
+
+		dev_warn(ice_pf_to_dev(pf),
+			 "<DPLL%i> state changed to: %d, pin %d\n",
+			 ICE_CGU_DPLL_PTP,
+			 pf->ptp_dpll_state,
+			 pin);
+	}
+}
+
 static void ice_ptp_periodic_work(struct kthread_work *work)
 {
 	struct ice_ptp *ptp = container_of(work, struct ice_ptp, work.work);
@@ -1774,6 +1804,9 @@ static void ice_ptp_periodic_work(struct kthread_work *work)
 	if (!test_bit(ICE_FLAG_PTP, pf->flags))
 		return;
 
+	if (ice_is_feature_supported(pf, ICE_F_CGU))
+		ice_handle_cgu_state(pf);
+
 	ice_ptp_update_cached_phctime(pf);
 
 	ice_ptp_tx_tstamp_cleanup(&pf->ptp.port.tx);
@@ -1958,3 +1991,4 @@ void ice_ptp_release(struct ice_pf *pf)
dev_info(ice_pf_to_dev(pf), "Removed PTP clock\n"); } + diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c index aa257db36765..7a9482918a20 100644 --- a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c +++ b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c @@ -375,6 +375,54 @@ static int ice_ptp_port_cmd_e810(struct ice_hw *hw, enum ice_ptp_tmr_cmd cmd) return 0; }
+/** + * ice_get_zl_dpll_state - get the state of the DPLL + * @hw: pointer to the hw struct + * @dpll_idx: Index of internal DPLL unit + * @pin: pointer to a buffer for returning currently active pin + * + * This function will read the state of the DPLL(dpll_idx). If optional + * parameter pin is given it'll be used to retrieve currently active pin. + * + * Return: state of the DPLL + */ +enum if_eec_state +ice_get_zl_dpll_state(struct ice_hw *hw, u8 dpll_idx, u8 *pin) +{ + enum ice_status status; + u64 phase_offset; + u16 dpll_state; + u8 ref_state; + u8 eec_mode; + + if (dpll_idx >= ICE_CGU_DPLL_MAX) + return IF_EEC_STATE_INVALID; + + status = ice_aq_get_cgu_dpll_status(hw, dpll_idx, &ref_state, + &dpll_state, &phase_offset, + &eec_mode); + if (status) + return IF_EEC_STATE_INVALID; + + if (pin) { + /* current ref pin in dpll_state_refsel_status_X register */ + *pin = (dpll_state & + ICE_AQC_GET_CGU_DPLL_STATUS_STATE_CLK_REF_SEL) >> + ICE_AQC_GET_CGU_DPLL_STATUS_STATE_CLK_REF_SHIFT; + } + + if (dpll_state & ICE_AQC_GET_CGU_DPLL_STATUS_STATE_LOCK) { + if (dpll_state & ICE_AQC_GET_CGU_DPLL_STATUS_STATE_HO_READY) + return IF_EEC_STATE_LOCKED_HO_ACQ; + else + return IF_EEC_STATE_LOCKED; + } else if ((dpll_state & ICE_AQC_GET_CGU_DPLL_STATUS_STATE_HO) && + (dpll_state & ICE_AQC_GET_CGU_DPLL_STATUS_STATE_HO_READY)) { + return IF_EEC_STATE_HOLDOVER; + } + return IF_EEC_STATE_FREERUN; +} + /* Device agnostic functions * * The following functions implement useful behavior to hide the differences diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_hw.h b/drivers/net/ethernet/intel/ice/ice_ptp_hw.h index b2984b5c22c1..fcd543531b2c 100644 --- a/drivers/net/ethernet/intel/ice/ice_ptp_hw.h +++ b/drivers/net/ethernet/intel/ice/ice_ptp_hw.h @@ -33,6 +33,8 @@ int ice_ptp_init_phy_e810(struct ice_hw *hw); int ice_read_sma_ctrl_e810t(struct ice_hw *hw, u8 *data); int ice_write_sma_ctrl_e810t(struct ice_hw *hw, u8 data); bool ice_is_pca9575_present(struct ice_hw *hw); +enum 
if_eec_state +ice_get_zl_dpll_state(struct ice_hw *hw, u8 dpll_idx, u8 *pin);
#define PFTSYN_SEM_BYTES 4
@@ -98,4 +100,24 @@ bool ice_is_pca9575_present(struct ice_hw *hw); #define ICE_SMA_MAX_BIT_E810T 7 #define ICE_PCA9575_P1_OFFSET 8
+enum ice_e810t_cgu_dpll { + ICE_CGU_DPLL_SYNCE, + ICE_CGU_DPLL_PTP, + ICE_CGU_DPLL_MAX +}; + +enum ice_e810t_cgu_pins { + REF0P, + REF0N, + REF1P, + REF1N, + REF2P, + REF2N, + REF3P, + REF3N, + REF4P, + REF4N, + NUM_E810T_CGU_PINS +}; + #endif /* _ICE_PTP_HW_H_ */
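The state mapping done by ice_get_zl_dpll_state() in the patch above can be followed more easily as a stand-alone sketch. The bit values (DPLL_LOCK, DPLL_HO, DPLL_HO_READY) and the local enum below are placeholders chosen for illustration only; the real ICE_AQC_GET_CGU_DPLL_STATUS_STATE_* masks live in ice_adminq_cmd.h and are not shown in this part of the series. The second helper mirrors how the two 32-bit halves of the phase offset are joined in ice_aq_get_cgu_dpll_status().

```c
#include <stdint.h>

/* Hypothetical bit layout, for illustration only. */
#define DPLL_LOCK     (1u << 0)
#define DPLL_HO       (1u << 1)
#define DPLL_HO_READY (1u << 2)

/* Local stand-in for enum if_eec_state from the series. */
enum eec_state {
	EEC_INVALID,
	EEC_FREERUN,
	EEC_LOCKED,
	EEC_LOCKED_HO_ACQ,
	EEC_HOLDOVER,
};

/*
 * Same decision order as the driver code: a locked DPLL reports
 * LOCKED_HO_ACQ once holdover memory is ready, an unlocked DPLL is in
 * HOLDOVER only if both the holdover and holdover-ready bits are set,
 * and everything else is treated as free-running.
 */
enum eec_state decode_dpll_state(uint16_t dpll_state)
{
	if (dpll_state & DPLL_LOCK)
		return (dpll_state & DPLL_HO_READY) ? EEC_LOCKED_HO_ACQ
						    : EEC_LOCKED;
	if ((dpll_state & DPLL_HO) && (dpll_state & DPLL_HO_READY))
		return EEC_HOLDOVER;
	return EEC_FREERUN;
}

/* Join the high and low 32-bit words into one 64-bit phase offset. */
uint64_t combine_phase_offset(uint32_t hi, uint32_t lo)
{
	return ((uint64_t)hi << 32) | lo;
}
```

Note that with this ordering a DPLL that reports the holdover bit without the holdover-ready bit is reported as free-running, which matches the driver code as posted.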
Add support for RTNL messages for reading/configuring SyncE recovered clocks. The messages are:
RTM_GETRCLKSTATE: Read the state of recovered pins that output recovered clock from a given port. The message will contain the number of assigned clocks (IFLA_RCLK_STATE_COUNT) and N pin indexes in IFLA_RCLK_STATE_OUT_STATE attributes.
RTM_SETRCLKSTATE: Sets the redirection of the recovered clock for a given pin.
Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com>
---
 include/linux/netdevice.h      |   9 +++
 include/uapi/linux/if_link.h   |  22 +++++++
 include/uapi/linux/rtnetlink.h |  13 ++--
 net/core/rtnetlink.c           | 110 +++++++++++++++++++++++++++++++++
 security/selinux/nlmsgtab.c    |   2 +
 5 files changed, 152 insertions(+), 4 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index ef2b381dae0c..708bd8336155 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -1576,6 +1576,15 @@ struct net_device_ops { int (*ndo_get_eec_src)(struct net_device *dev, u32 *src, struct netlink_ext_ack *extack); + int (*ndo_get_rclk_range)(struct net_device *dev, + u32 *min_idx, u32 *max_idx, + struct netlink_ext_ack *extack); + int (*ndo_set_rclk_out)(struct net_device *dev, + u32 out_idx, bool ena, + struct netlink_ext_ack *extack); + int (*ndo_get_rclk_state)(struct net_device *dev, + u32 out_idx, bool *ena, + struct netlink_ext_ack *extack); };
/** diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h index 3628a55fdd10..8a708cbd3c6d 100644 --- a/include/uapi/linux/if_link.h +++ b/include/uapi/linux/if_link.h @@ -1300,4 +1300,26 @@ enum {
#define IFLA_EEC_MAX (__IFLA_EEC_MAX - 1)
+struct if_set_rclk_msg { + __u32 ifindex; + __u32 out_idx; + __u32 flags; +}; + +#define SET_RCLK_FLAGS_ENA (1U << 0) + +enum { + IFLA_RCLK_STATE_UNSPEC, + IFLA_RCLK_STATE_OUT_STATE, + IFLA_RCLK_STATE_COUNT, + __IFLA_RCLK_STATE_MAX, +}; + +struct if_get_rclk_msg { + __u32 out_idx; + __u32 flags; +}; + +#define GET_RCLK_FLAGS_ENA (1U << 0) + #endif /* _UAPI_LINUX_IF_LINK_H */ diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h index 1d8662afd6bd..b02fcbfc7b5e 100644 --- a/include/uapi/linux/rtnetlink.h +++ b/include/uapi/linux/rtnetlink.h @@ -185,6 +185,11 @@ enum { RTM_GETNEXTHOPBUCKET, #define RTM_GETNEXTHOPBUCKET RTM_GETNEXTHOPBUCKET
+ RTM_GETRCLKSTATE = 120, +#define RTM_GETRCLKSTATE RTM_GETRCLKSTATE + RTM_SETRCLKSTATE = 121, +#define RTM_SETRCLKSTATE RTM_SETRCLKSTATE + RTM_GETEECSTATE = 124, #define RTM_GETEECSTATE RTM_GETEECSTATE
@@ -196,7 +201,7 @@ enum { #define RTM_NR_FAMILIES (RTM_NR_MSGTYPES >> 2) #define RTM_FAM(cmd) (((cmd) - RTM_BASE) >> 2)
-/* +/* Generic structure for encapsulation of optional route information. It is reminiscent of sockaddr, but with sa_family replaced with attribute type. @@ -236,7 +241,7 @@ struct rtmsg {
unsigned char rtm_table; /* Routing table id */ unsigned char rtm_protocol; /* Routing protocol; see below */ - unsigned char rtm_scope; /* See below */ + unsigned char rtm_scope; /* See below */ unsigned char rtm_type; /* See below */
unsigned rtm_flags; @@ -558,7 +563,7 @@ struct ifinfomsg { };
/******************************************************************** - * prefix information + * prefix information ****/
struct prefixmsg { @@ -572,7 +577,7 @@ struct prefixmsg { unsigned char prefix_pad3; };
-enum +enum { PREFIX_UNSPEC, PREFIX_ADDRESS, diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 03bc773d0e69..5d69cbb7fc50 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -5544,6 +5544,113 @@ static int rtnl_eec_state_get(struct sk_buff *skb, struct nlmsghdr *nlh, return err; }
+static int rtnl_fill_rclk_state(struct sk_buff *skb, struct net_device *dev, + u32 portid, u32 seq, + struct netlink_callback *cb, int flags, + struct netlink_ext_ack *extack) +{ + const struct net_device_ops *ops = dev->netdev_ops; + u32 min_idx, max_idx, src_idx, count = 0; + struct if_eec_state_msg *state_msg; + struct nlmsghdr *nlh; + bool ena; + int err; + + ASSERT_RTNL(); + + if (!ops->ndo_get_rclk_state || !ops->ndo_get_rclk_range) + return -EOPNOTSUPP; + + err = ops->ndo_get_rclk_range(dev, &min_idx, &max_idx, extack); + if (err) + return err; + + nlh = nlmsg_put(skb, portid, seq, RTM_GETRCLKSTATE, sizeof(*state_msg), + flags); + if (!nlh) + return -EMSGSIZE; + + state_msg = nlmsg_data(nlh); + state_msg->ifindex = dev->ifindex; + + for (src_idx = min_idx; src_idx <= max_idx; src_idx++) { + struct if_get_rclk_msg rclk_state; + + ops->ndo_get_rclk_state(dev, src_idx, &ena, extack); + + rclk_state.out_idx = src_idx; + rclk_state.flags = ena ? GET_RCLK_FLAGS_ENA : 0; + + if (nla_put(skb, IFLA_RCLK_STATE_OUT_STATE, sizeof(rclk_state), + &rclk_state)) + return -EMSGSIZE; + count++; + } + + if (nla_put_u32(skb, IFLA_RCLK_STATE_COUNT, count)) + return -EMSGSIZE; + + nlmsg_end(skb, nlh); + return 0; +} + +static int rtnl_rclk_state_get(struct sk_buff *skb, struct nlmsghdr *nlh, + struct netlink_ext_ack *extack) +{ + struct net *net = sock_net(skb->sk); + struct if_eec_state_msg *state; + struct net_device *dev; + struct sk_buff *nskb; + int err; + + state = nlmsg_data(nlh); + dev = __dev_get_by_index(net, state->ifindex); + if (!dev) { + NL_SET_ERR_MSG(extack, "unknown ifindex"); + return -ENODEV; + } + + nskb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); + if (!nskb) + return -ENOBUFS; + + err = rtnl_fill_rclk_state(nskb, dev, NETLINK_CB(skb).portid, + nlh->nlmsg_seq, NULL, nlh->nlmsg_flags, + extack); + if (err < 0) + kfree_skb(nskb); + else + err = rtnl_unicast(nskb, net, NETLINK_CB(skb).portid); + + return err; +} + +static int rtnl_rclk_set(struct sk_buff *skb, 
struct nlmsghdr *nlh, + struct netlink_ext_ack *extack) +{ + struct net *net = sock_net(skb->sk); + struct if_set_rclk_msg *state; + struct net_device *dev; + bool ena; + int err; + + state = nlmsg_data(nlh); + dev = __dev_get_by_index(net, state->ifindex); + if (!dev) { + NL_SET_ERR_MSG(extack, "unknown ifindex"); + return -ENODEV; + } + + if (!dev->netdev_ops->ndo_set_rclk_out) + return -EOPNOTSUPP; + + ena = !!(state->flags & SET_RCLK_FLAGS_ENA); + err = dev->netdev_ops->ndo_set_rclk_out(dev, state->out_idx, ena, + extack); + + return err; +} + /* Process one rtnetlink message. */
static int rtnetlink_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, @@ -5770,5 +5877,8 @@ void __init rtnetlink_init(void) rtnl_register(PF_UNSPEC, RTM_GETSTATS, rtnl_stats_get, rtnl_stats_dump, 0);
+ rtnl_register(PF_UNSPEC, RTM_GETRCLKSTATE, rtnl_rclk_state_get, NULL, 0); + rtnl_register(PF_UNSPEC, RTM_SETRCLKSTATE, rtnl_rclk_set, NULL, 0); + rtnl_register(PF_UNSPEC, RTM_GETEECSTATE, rtnl_eec_state_get, NULL, 0); } diff --git a/security/selinux/nlmsgtab.c b/security/selinux/nlmsgtab.c index 2c66e722ea9c..1899c86694ff 100644 --- a/security/selinux/nlmsgtab.c +++ b/security/selinux/nlmsgtab.c @@ -91,6 +91,8 @@ static const struct nlmsg_perm nlmsg_route_perms[] = { RTM_NEWNEXTHOPBUCKET, NETLINK_ROUTE_SOCKET__NLMSG_WRITE }, { RTM_DELNEXTHOPBUCKET, NETLINK_ROUTE_SOCKET__NLMSG_WRITE }, { RTM_GETNEXTHOPBUCKET, NETLINK_ROUTE_SOCKET__NLMSG_READ }, + { RTM_GETRCLKSTATE, NETLINK_ROUTE_SOCKET__NLMSG_READ }, + { RTM_SETRCLKSTATE, NETLINK_ROUTE_SOCKET__NLMSG_WRITE }, { RTM_GETEECSTATE, NETLINK_ROUTE_SOCKET__NLMSG_READ }, };
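For anyone who wants to exercise rtnl_rclk_set() from user space, the request layout can be sketched as below. This is only an illustration under stated assumptions: RTM_SETRCLKSTATE, SET_RCLK_FLAGS_ENA and struct if_set_rclk_msg are copied from this patch and do not exist in mainline UAPI headers, and build_set_rclk_req() is a hypothetical helper name. The struct is carried as the message payload directly after the netlink header, which is what the nlmsg_data() call in the handler expects.

```c
#include <sys/socket.h>
#include <linux/netlink.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copied from the patch under review; not present in mainline headers. */
#define RTM_SETRCLKSTATE   121
#define SET_RCLK_FLAGS_ENA (1U << 0)

struct if_set_rclk_msg {
	uint32_t ifindex;
	uint32_t out_idx;
	uint32_t flags;
};

/*
 * Fill buf (which must be suitably aligned for struct nlmsghdr) with a
 * complete request. Returns the message length, or 0 if buf is too small.
 */
size_t build_set_rclk_req(void *buf, size_t buflen, uint32_t ifindex,
			  uint32_t out_idx, int enable, uint32_t seq)
{
	struct nlmsghdr *nlh = buf;
	struct if_set_rclk_msg *msg;
	size_t len = NLMSG_LENGTH(sizeof(*msg));

	if (buflen < NLMSG_ALIGN(len))
		return 0;

	memset(buf, 0, NLMSG_ALIGN(len));
	nlh->nlmsg_len = (uint32_t)len;
	nlh->nlmsg_type = RTM_SETRCLKSTATE;
	nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
	nlh->nlmsg_seq = seq;

	/* The payload starts right after the aligned netlink header. */
	msg = NLMSG_DATA(nlh);
	msg->ifindex = ifindex;
	msg->out_idx = out_idx;
	msg->flags = enable ? SET_RCLK_FLAGS_ENA : 0;

	return len;
}
```

The resulting buffer would be sent over an AF_NETLINK socket opened with NETLINK_ROUTE; that only does something useful on a kernel with this series applied.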
Implement NDO functions for handling SyncE recovered clocks.
Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com>
---
 .../net/ethernet/intel/ice/ice_adminq_cmd.h | 53 +++++++++++
 drivers/net/ethernet/intel/ice/ice_common.c | 65 +++++++++++++
 drivers/net/ethernet/intel/ice/ice_common.h |  6 ++
 drivers/net/ethernet/intel/ice/ice_main.c   | 91 +++++++++++++++++++
 include/linux/netdevice.h                   | 11 +++
 5 files changed, 226 insertions(+)
diff --git a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h index 11226af7a9a4..dace00a35c44 100644 --- a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h +++ b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h @@ -1281,6 +1281,31 @@ struct ice_aqc_set_mac_lb { u8 reserved[15]; };
+/* Set PHY recovered clock output (direct 0x0630) */ +struct ice_aqc_set_phy_rec_clk_out { + u8 phy_output; + u8 port_num; + u8 flags; +#define ICE_AQC_SET_PHY_REC_CLK_OUT_OUT_EN BIT(0) +#define ICE_AQC_SET_PHY_REC_CLK_OUT_CURR_PORT 0xFF + u8 rsvd; + __le32 freq; + u8 rsvd2[6]; + __le16 node_handle; +}; + +/* Get PHY recovered clock output (direct 0x0631) */ +struct ice_aqc_get_phy_rec_clk_out { + u8 phy_output; + u8 port_num; + u8 flags; +#define ICE_AQC_GET_PHY_REC_CLK_OUT_OUT_EN BIT(0) + u8 rsvd; + __le32 freq; + u8 rsvd2[6]; + __le16 node_handle; +}; + struct ice_aqc_link_topo_params { u8 lport_num; u8 lport_num_valid; @@ -1838,6 +1863,28 @@ struct ice_aqc_get_cgu_dpll_status { __le16 node_handle; };
+/* Read CGU register (direct 0x0C6E) */ +struct ice_aqc_read_cgu_reg { + __le16 offset; +#define ICE_AQC_READ_CGU_REG_MAX_DATA_LEN 16 + u8 data_len; + u8 rsvd[13]; +}; + +/* Read CGU register response (direct 0x0C6E) */ +struct ice_aqc_read_cgu_reg_resp { + u8 data[ICE_AQC_READ_CGU_REG_MAX_DATA_LEN]; +}; + +/* Write CGU register (direct 0x0C6F) */ +struct ice_aqc_write_cgu_reg { + __le16 offset; +#define ICE_AQC_WRITE_CGU_REG_MAX_DATA_LEN 7 + u8 data_len; + u8 data[ICE_AQC_WRITE_CGU_REG_MAX_DATA_LEN]; + u8 rsvd[6]; +}; + /* Configure Firmware Logging Command (indirect 0xFF09) * Logging Information Read Response (indirect 0xFF10) * Note: The 0xFF10 command has no input parameters. @@ -2033,6 +2080,8 @@ struct ice_aq_desc { struct ice_aqc_get_phy_caps get_phy; struct ice_aqc_set_phy_cfg set_phy; struct ice_aqc_restart_an restart_an; + struct ice_aqc_set_phy_rec_clk_out set_phy_rec_clk_out; + struct ice_aqc_get_phy_rec_clk_out get_phy_rec_clk_out; struct ice_aqc_gpio read_write_gpio; struct ice_aqc_sff_eeprom read_write_sff_param; struct ice_aqc_set_port_id_led set_port_id_led; @@ -2188,6 +2237,8 @@ enum ice_adminq_opc { ice_aqc_opc_get_link_status = 0x0607, ice_aqc_opc_set_event_mask = 0x0613, ice_aqc_opc_set_mac_lb = 0x0620, + ice_aqc_opc_set_phy_rec_clk_out = 0x0630, + ice_aqc_opc_get_phy_rec_clk_out = 0x0631, ice_aqc_opc_get_link_topo = 0x06E0, ice_aqc_opc_set_port_id_led = 0x06E9, ice_aqc_opc_set_gpio = 0x06EC, @@ -2238,6 +2289,8 @@ enum ice_adminq_opc {
/* 1588/SyncE commands/events */ ice_aqc_opc_get_cgu_dpll_status = 0x0C66, + ice_aqc_opc_read_cgu_reg = 0x0C6E, + ice_aqc_opc_write_cgu_reg = 0x0C6F,
ice_aqc_opc_driver_shared_params = 0x0C90,
diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c index 8069141ac105..29d302ea1e56 100644 --- a/drivers/net/ethernet/intel/ice/ice_common.c +++ b/drivers/net/ethernet/intel/ice/ice_common.c @@ -5242,3 +5242,68 @@ bool ice_is_clock_mux_present_e810t(struct ice_hw *hw) return true; }
+/** + * ice_aq_set_phy_rec_clk_out - set RCLK phy out + * @hw: pointer to the HW struct + * @phy_output: PHY reference clock output pin + * @enable: true to enable the recovered clock output, false to disable it + * @freq: PHY output frequency + * + * Set PHY recovered clock output (0x0630) + * Return 0 on success or negative value on failure. + */ +enum ice_status +ice_aq_set_phy_rec_clk_out(struct ice_hw *hw, u8 phy_output, bool enable, + u32 *freq) +{ + struct ice_aqc_set_phy_rec_clk_out *cmd; + struct ice_aq_desc desc; + enum ice_status status; + + ice_fill_dflt_direct_cmd_desc(&desc, ice_aqc_opc_set_phy_rec_clk_out); + cmd = &desc.params.set_phy_rec_clk_out; + cmd->phy_output = phy_output; + cmd->port_num = ICE_AQC_SET_PHY_REC_CLK_OUT_CURR_PORT; + cmd->flags = enable & ICE_AQC_SET_PHY_REC_CLK_OUT_OUT_EN; + cmd->freq = cpu_to_le32(*freq); + + status = ice_aq_send_cmd(hw, &desc, NULL, 0, NULL); + if (!status) + *freq = le32_to_cpu(cmd->freq); + + return status; +} + +/** + * ice_aq_get_phy_rec_clk_out - get PHY recovered clock output + * @hw: pointer to the HW struct + * @phy_output: PHY reference clock output pin + * @port_num: Port number + * @flags: PHY flags + * @freq: PHY output frequency + * + * Get PHY recovered clock output (0x0631) + */ +enum ice_status +ice_aq_get_phy_rec_clk_out(struct ice_hw *hw, u8 phy_output, u8 *port_num, + u8 *flags, u32 *freq) +{ + struct ice_aqc_get_phy_rec_clk_out *cmd; + struct ice_aq_desc desc; + enum ice_status status; + + ice_fill_dflt_direct_cmd_desc(&desc, ice_aqc_opc_get_phy_rec_clk_out); + cmd = &desc.params.get_phy_rec_clk_out; + cmd->phy_output = phy_output; + cmd->port_num = *port_num; + + status = ice_aq_send_cmd(hw, &desc, NULL, 0, NULL); + if (!status) { + *port_num = cmd->port_num; + *flags = cmd->flags; + *freq = le32_to_cpu(cmd->freq); + } + + return status; +} + diff --git a/drivers/net/ethernet/intel/ice/ice_common.h b/drivers/net/ethernet/intel/ice/ice_common.h index aaed388a40a8..8a99c8364173 100644 --- a/drivers/net/ethernet/intel/ice/ice_common.h +++
b/drivers/net/ethernet/intel/ice/ice_common.h @@ -166,6 +166,12 @@ ice_ena_vsi_rdma_qset(struct ice_port_info *pi, u16 vsi_handle, u8 tc, enum ice_status ice_aq_get_cgu_dpll_status(struct ice_hw *hw, u8 dpll_num, u8 *ref_state, u16 *dpll_state, u64 *phase_offset, u8 *eec_mode); +enum ice_status +ice_aq_set_phy_rec_clk_out(struct ice_hw *hw, u8 phy_output, bool enable, + u32 *freq); +enum ice_status +ice_aq_get_phy_rec_clk_out(struct ice_hw *hw, u8 phy_output, u8 *port_num, + u8 *flags, u32 *freq); int ice_dis_vsi_rdma_qset(struct ice_port_info *pi, u16 count, u32 *qset_teid, u16 *q_id); diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c index 7fac27903ab4..98834aa3f3dc 100644 --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -6284,6 +6284,94 @@ ice_get_eec_src(struct net_device *netdev, u32 *src, return 0; }
+/** + * ice_get_rclk_range - get range of recovered clock indices + * @netdev: network interface device structure + * @min_idx: min rclk index + * @max_idx: max rclk index + * @extack: netlink extended ack + */ +static int +ice_get_rclk_range(struct net_device *netdev, u32 *min_idx, u32 *max_idx, + struct netlink_ext_ack *extack) +{ + struct ice_netdev_priv *np = netdev_priv(netdev); + struct ice_vsi *vsi = np->vsi; + struct ice_pf *pf = vsi->back; + + if (!ice_is_feature_supported(pf, ICE_F_CGU)) + return -EOPNOTSUPP; + + *min_idx = REF1P; + *max_idx = REF1N; + + return 0; +} + +/** + * ice_set_rclk_out - set recovered clock redirection to the output pin + * @netdev: network interface device structure + * @out_idx: output index + * @ena: true will enable redirection, false will disable it + * @extack: netlink extended ack + */ +static int +ice_set_rclk_out(struct net_device *netdev, u32 out_idx, bool ena, + struct netlink_ext_ack *extack) +{ + struct ice_netdev_priv *np = netdev_priv(netdev); + struct ice_vsi *vsi = np->vsi; + struct ice_pf *pf = vsi->back; + enum ice_status ret; + u32 freq; + + if (!ice_is_feature_supported(pf, ICE_F_CGU)) + return -EOPNOTSUPP; + + if (out_idx < REF1P || out_idx > REF1N) + return -EINVAL; + + ret = ice_aq_set_phy_rec_clk_out(&pf->hw, out_idx - REF1P, ena, &freq); + + return ice_status_to_errno(ret); +} + +/** + * ice_get_rclk_state - Get state of recovered clock pin for a given netdev + * @netdev: network interface device structure + * @out_idx: output index + * @ena: returns true if the pin is enabled + * @extack: netlink extended ack + */ +static int +ice_get_rclk_state(struct net_device *netdev, u32 out_idx, bool *ena, + struct netlink_ext_ack *extack) +{ + u8 port_num = ICE_AQC_SET_PHY_REC_CLK_OUT_CURR_PORT; + struct ice_netdev_priv *np = netdev_priv(netdev); + struct ice_vsi *vsi = np->vsi; + struct ice_pf *pf = vsi->back; + enum ice_status ret; + u32 freq; + u8 flags; + + if (!ice_is_feature_supported(pf, ICE_F_CGU)) + 
return -EOPNOTSUPP; + + if (out_idx < REF1P || out_idx > REF1N) + return -EINVAL; + + ret = ice_aq_get_phy_rec_clk_out(&pf->hw, out_idx - REF1P, &port_num, + &flags, &freq); + + if (!ret && (flags & ICE_AQC_GET_PHY_REC_CLK_OUT_OUT_EN)) + *ena = true; + else + *ena = false; + + return ice_status_to_errno(ret); +} + /** * ice_down - Shutdown the connection * @vsi: The VSI being stopped @@ -8647,4 +8735,7 @@ static const struct net_device_ops ice_netdev_ops = { .ndo_xsk_wakeup = ice_xsk_wakeup, .ndo_get_eec_state = ice_get_eec_state, .ndo_get_eec_src = ice_get_eec_src, + .ndo_get_rclk_range = ice_get_rclk_range, + .ndo_set_rclk_out = ice_set_rclk_out, + .ndo_get_rclk_state = ice_get_rclk_state, }; diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 708bd8336155..9faa005506d1 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -1351,6 +1351,17 @@ struct netdev_net_notifier { * struct netlink_ext_ack *extack); * Get the index of the source signal that's currently used as EEC's * reference + * int (*ndo_get_rclk_range)(struct net_device *dev, u32 *min_idx, u32 *max_idx, + * struct netlink_ext_ack *extack); + * Get range of valid output indices for the set/get Recovered Clock + * functions + * int (*ndo_set_rclk_out)(struct net_device *dev, u32 out_idx, bool ena, + * struct netlink_ext_ack *extack); + * Set the receive clock recovery redirection to a given Recovered Clock + * output. + * int (*ndo_get_rclk_state)(struct net_device *dev, u32 out_idx, bool *ena, + * struct netlink_ext_ack *extack); + * Get current state of the recovered clock to pin mapping. */ struct net_device_ops { int (*ndo_init)(struct net_device *dev);
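The index handling shared by ice_set_rclk_out() and ice_get_rclk_state() can be summarised in a few lines. The sketch below is a user-space illustration, not driver code: the pin enum is copied from ice_ptp_hw.h earlier in the series, while rclk_out_to_phy_output() is a hypothetical helper name. It shows that only REF1P and REF1N are accepted as recovered clock outputs and that the value handed to the admin queue command is the pin index rebased to REF1P.

```c
#include <errno.h>
#include <stdint.h>

/* Pin numbering as defined in ice_ptp_hw.h by this series. */
enum ice_e810t_cgu_pins {
	REF0P,
	REF0N,
	REF1P,
	REF1N,
	REF2P,
	REF2N,
	REF3P,
	REF3N,
	REF4P,
	REF4N,
	NUM_E810T_CGU_PINS
};

/*
 * Translate a netlink-level output index into the PHY output number.
 * Returns 0 on success or -EINVAL if the index is outside REF1P..REF1N.
 */
int rclk_out_to_phy_output(uint32_t out_idx, uint8_t *phy_output)
{
	if (out_idx < REF1P || out_idx > REF1N)
		return -EINVAL;

	*phy_output = (uint8_t)(out_idx - REF1P);
	return 0;
}
```

With this numbering, user space sees output indexes 2 and 3, which map to PHY outputs 0 and 1.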
Add Documentation/networking/synce.rst describing new RTNL messages and respective NDO ops supporting SyncE (Synchronous Ethernet).
Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com>
---
 Documentation/networking/synce.rst | 124 +++++++++++++++++++++++++++++
 1 file changed, 124 insertions(+)
 create mode 100644 Documentation/networking/synce.rst
diff --git a/Documentation/networking/synce.rst b/Documentation/networking/synce.rst
new file mode 100644
index 000000000000..a7bb75685c07
--- /dev/null
+++ b/Documentation/networking/synce.rst
@@ -0,0 +1,124 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=============================
+Synchronous Equipment Clocks
+=============================
+
+Synchronous Equipment Clocks use a physical layer clock to syntonize
+the frequency across different network elements.
+
+A basic synchronous network node consists of a Synchronous Equipment
+Clock (SEC) and a PHY that has dedicated outputs of clocks recovered
+from the Receive side and a dedicated TX clock input that is used as
+a reference for the physical frequency of the transmit data to other nodes.
+
+The PHY is able to recover the physical signal frequency of the RX data
+stream on RX ports and redirect it (sometimes dividing it) to recovered
+clock outputs. The number of recovered clock output pins is usually lower
+than the number of RX ports. As a result the RX port to Recovered Clock output
+mapping needs to be configured. The TX frequency depends directly on the
+input frequency - either on the PHY CLK input, or on a dedicated
+TX clock input.
+
+      ┌──────────┬──────────┐
+      │ RX       │ TX       │
+  1   │ ports    │ ports    │ 1
+  ───►├─────┐    │          ├─────►
+  2   │     │    │          │ 2
+  ───►├───┐ │    │          ├─────►
+  3   │   │ │    │          │ 3
+  ───►├─┐ │ │    │          ├─────►
+      │ ▼ ▼ ▼    │          │
+      │ ──────   │          │
+      │ \____/   │          │
+      └──┼──┼────┴──────────┘
+        1│ 2│        ▲
+ RCLK out│  │        │ TX CLK in
+         ▼  ▼        │
+       ┌─────────────┴───┐
+       │                 │
+       │       SEC       │
+       │                 │
+       └─────────────────┘
+
+The SEC can synchronize its frequency to one of the synchronization inputs:
+either clocks recovered on traffic interfaces or (in advanced deployments)
+external frequency sources.
+
+Some SEC implementations can automatically select the synchronization source
+through priority tables and synchronization status messaging and provide
+necessary filtering and holdover capabilities.
+
+The following interface can be applied to different packet network types
+following ITU-T G.8261/G.8262 recommendations.
+
+Interface
+=========
+
+The following RTNL messages are used to read/configure SyncE recovered
+clocks.
+
+RTM_GETRCLKSTATE
+-----------------
+Read the state of recovered pins that output recovered clock from
+a given port. The message will contain the number of assigned clocks
+(IFLA_RCLK_STATE_COUNT) and N pin indexes in IFLA_RCLK_STATE_OUT_STATE.
+To support multiple recovered clock outputs from the same port, this message
+will return the IFLA_RCLK_STATE_COUNT attribute containing the number of
+recovered clock outputs (N) and N IFLA_RCLK_STATE_OUT_STATE attributes
+listing the output indexes with the respective GET_RCLK_FLAGS_ENA flag.
+This message will call the ndo_get_rclk_range to determine the allowed
+recovered clock indexes and then will loop through them, calling
+the ndo_get_rclk_state for each of them.
+
+
+Attributes:
+IFLA_RCLK_STATE_COUNT - Returns the number of recovered clock outputs
+IFLA_RCLK_STATE_OUT_STATE - Returns the current state of a single recovered
+                            clock output in the struct if_get_rclk_msg.
+struct if_get_rclk_msg {
+	__u32 out_idx; /* output index (from a valid range) */
+	__u32 flags; /* configuration flags */
+};
+
+Currently supported flags:
+#define GET_RCLK_FLAGS_ENA (1U << 0)
+
+
+RTM_SETRCLKSTATE
+-----------------
+Sets the redirection of the recovered clock for a given pin. This message
+expects the following structure as its payload:
+struct if_set_rclk_msg {
+	__u32 ifindex; /* interface index */
+	__u32 out_idx; /* output index (from a valid range) */
+	__u32 flags; /* configuration flags */
+};
+
+Supported flags are:
+SET_RCLK_FLAGS_ENA - if set in flags - the given output will be enabled,
+                     if clear - the output will be disabled.
+
+RTM_GETEECSTATE
+----------------
+Reads the state of the EEC or equivalent physical clock synchronizer.
+This message returns the following attributes:
+IFLA_EEC_STATE - current state of the EEC or equivalent clock generator.
+                 The states returned in this attribute are aligned to the
+                 ITU-T G.781 and are:
+                  IF_EEC_STATE_INVALID - state is not valid
+                  IF_EEC_STATE_FREERUN - clock is free-running
+                  IF_EEC_STATE_LOCKED - clock is locked to the reference,
+                                        but the holdover memory is not valid
+                  IF_EEC_STATE_LOCKED_HO_ACQ - clock is locked to the reference
+                                               and holdover memory is valid
+                  IF_EEC_STATE_HOLDOVER - clock is in holdover mode
+The state is read from the netdev by calling:
+int (*ndo_get_eec_state)(struct net_device *dev, enum if_eec_state *state,
+                         struct netlink_ext_ack *extack);
+
+IFLA_EEC_SRC_IDX - optional attribute returning the index of the reference
+                   that is used for the current IFLA_EEC_STATE, i.e.,
+                   the index of the pin that the EEC is locked to.
+
+Will be returned only if the ndo_get_eec_src is implemented.
\ No newline at end of file
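To make the RTM_GETRCLKSTATE reply format described in the document more concrete, here is a rough user-space sketch of walking the returned attributes. It assumes the attribute numbering and struct if_get_rclk_msg from this series (neither is in mainline UAPI headers), and parse_rclk_attrs() is a hypothetical helper name. The caller is expected to pass a pointer to the first attribute, i.e. the reply data after the netlink header and the fixed message header.

```c
#include <sys/socket.h>
#include <linux/netlink.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copied from the patch under review; not present in mainline headers. */
enum {
	IFLA_RCLK_STATE_UNSPEC,
	IFLA_RCLK_STATE_OUT_STATE,
	IFLA_RCLK_STATE_COUNT,
};

#define GET_RCLK_FLAGS_ENA (1U << 0)

struct if_get_rclk_msg {
	uint32_t out_idx;
	uint32_t flags;
};

/*
 * Walk a buffer of netlink attributes. Up to max OUT_STATE entries are
 * copied to out, and *count receives the IFLA_RCLK_STATE_COUNT value if
 * present. Returns the number of entries stored, or -1 on malformed input.
 */
int parse_rclk_attrs(const void *attrs, size_t len,
		     struct if_get_rclk_msg *out, int max, uint32_t *count)
{
	const unsigned char *p = attrs;
	int n = 0;

	while (len >= (size_t)NLA_HDRLEN) {
		struct nlattr nla;
		size_t payload, step;

		memcpy(&nla, p, sizeof(nla));
		if (nla.nla_len < NLA_HDRLEN || nla.nla_len > len)
			return -1;
		payload = nla.nla_len - NLA_HDRLEN;

		switch (nla.nla_type) {
		case IFLA_RCLK_STATE_OUT_STATE:
			if (payload < sizeof(*out))
				return -1;
			if (n < max)
				memcpy(&out[n++], p + NLA_HDRLEN, sizeof(*out));
			break;
		case IFLA_RCLK_STATE_COUNT:
			if (payload < sizeof(*count))
				return -1;
			memcpy(count, p + NLA_HDRLEN, sizeof(*count));
			break;
		default:
			break; /* ignore attributes this sketch does not know */
		}

		/* Attributes are padded to a 4-byte boundary. */
		step = NLA_ALIGN(nla.nla_len);
		if (step > len)
			break;
		p += step;
		len -= step;
	}

	return n;
}
```

An output is enabled when the GET_RCLK_FLAGS_ENA bit is set in the flags field of the corresponding entry. Since the kernel code as posted emits IFLA_RCLK_STATE_COUNT after the per-output attributes, the sketch does not rely on attribute order.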
Maciej Machnikowski <maciej.machnikowski@intel.com> writes:
Add Documentation/networking/synce.rst describing new RTNL messages and respective NDO ops supporting SyncE (Synchronous Ethernet).
Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com>
Documentation/networking/synce.rst | 124 +++++++++++++++++++++++++++++ 1 file changed, 124 insertions(+) create mode 100644 Documentation/networking/synce.rst
diff --git a/Documentation/networking/synce.rst b/Documentation/networking/synce.rst
new file mode 100644
index 000000000000..a7bb75685c07
--- /dev/null
+++ b/Documentation/networking/synce.rst
@@ -0,0 +1,124 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=============================
+Synchronous Equipment Clocks
+=============================
+
+Synchronous Equipment Clocks use a physical layer clock to syntonize
+the frequency across different network elements.
+
+Basic Synchronous network node consist of a Synchronous Equipment
+Clock (SEC) and and a PHY that has dedicated outputs of clocks recovered
+from the Receive side and a dedicated TX clock input that is used as
+a reference for the physical frequency of the transmit data to other nodes.
+
+The PHY is able to recover the physical signal frequency of the RX data
+stream on RX ports and redirect it (sometimes dividing it) to recovered
+clock outputs. Number of recovered clock output pins is usually lower than
+the number of RX portx. As a result the RX port to Recovered Clock output
+mapping needs to be configured. the TX frequency is directly depends on the
+input frequency - either on the PHY CLK input, or on a dedicated
+TX clock input.
+
+      ┌──────────┬──────────┐
+      │ RX       │ TX       │
+  1   │ ports    │ ports    │ 1
+  ───►├─────┐    │          ├─────►
+  2   │     │    │          │ 2
+  ───►├───┐ │    │          ├─────►
+  3   │   │ │    │          │ 3
+  ───►├─┐ │ │    │          ├─────►
+      │ ▼ ▼ ▼    │          │
+      │ ──────   │          │
+      │ \____/   │          │
+      └──┼──┼────┴──────────┘
+        1│ 2│        ▲
+ RCLK out│  │        │ TX CLK in
+         ▼  ▼        │
+       ┌─────────────┴───┐
+       │                 │
+       │       SEC       │
+       │                 │
+       └─────────────────┘
+
+The SEC can synchronize its frequency to one of the synchronization inputs +either clocks recovered on traffic interfaces or (in advanced deployments) +external frequency sources.
+Some SEC implementations can automatically select synchronization source +through priority tables and synchronization status messaging and provide +necessary filtering and holdover capabilities.
+The following interface can be applicable to diffferent packet network types +following ITU-T G.8261/G.8262 recommendations.
+Interface +=========
+The following RTNL messages are used to read/configure SyncE recovered +clocks.
+RTM_GETRCLKSTATE +----------------- +Read the state of recovered pins that output recovered clock from +a given port. To support multiple recovered clock outputs from the same +port, this message will return the IFLA_RCLK_STATE_COUNT attribute +containing the number of recovered clock outputs (N) and N +IFLA_RCLK_STATE_OUT_STATE attributes listing the output indexes with the +respective GET_RCLK_FLAGS_ENA flag. +This message will call the ndo_get_rclk_range to determine the allowed +recovered clock indexes and then will loop through them, calling +the ndo_get_rclk_state for each of them.
+Attributes: +IFLA_RCLK_STATE_COUNT - Returns the number of recovered clock outputs +IFLA_RCLK_STATE_OUT_STATE - Returns the current state of a single recovered
clock output in the struct if_get_rclk_msg.
+struct if_get_rclk_msg {
+	__u32 out_idx; /* output index (from a valid range) */
+	__u32 flags; /* configuration flags */
+};
+Currently supported flags: +#define GET_RCLK_FLAGS_ENA (1U << 0)
+RTM_SETRCLKSTATE +----------------- +Sets the redirection of the recovered clock for a given pin. This message +expects one attribute: +struct if_set_rclk_msg {
+	__u32 ifindex; /* interface index */
+	__u32 out_idx; /* output index (from a valid range) */
+	__u32 flags; /* configuration flags */
+};
+Supported flags are: +SET_RCLK_FLAGS_ENA - if set in flags - the given output will be enabled,
if clear - the output will be disabled.
+RTM_GETEECSTATE +---------------- +Reads the state of the EEC or equivalent physical clock synchronizer. +This message returns the following attributes: +IFLA_EEC_STATE - current state of the EEC or equivalent clock generator.
The states returned in this attribute are aligned to the
ITU-T G.781 and are:
IF_EEC_STATE_INVALID - state is not valid
IF_EEC_STATE_FREERUN - clock is free-running
IF_EEC_STATE_LOCKED - clock is locked to the reference,
but the holdover memory is not valid
IF_EEC_STATE_LOCKED_HO_ACQ - clock is locked to the reference
and holdover memory is valid
IF_EEC_STATE_HOLDOVER - clock is in holdover mode
+State is read from the netdev calling the: +int (*ndo_get_eec_state)(struct net_device *dev, enum if_eec_state *state,
u32 *src_idx, struct netlink_ext_ack *extack);
+IFLA_EEC_SRC_IDX - optional attribute returning the index of the reference
that is used for the current IFLA_EEC_STATE, i.e.,
the index of the pin that the EEC is locked to.
+Will be returned only if the ndo_get_eec_src is implemented. \ No newline at end of file
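As a rough illustration of how userspace might consume the attributes described in the patch above, here is a minimal sketch in C. The struct layout, the GET_RCLK_FLAGS_ENA bit and the state names are taken from the documentation text; the numeric values of the enum, the use of uint32_t in place of __u32, and the helper names rclk_count_enabled() and eec_state_name() are assumptions made only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors struct if_get_rclk_msg from the text; uint32_t stands in for __u32. */
struct if_get_rclk_msg {
	uint32_t out_idx;	/* output index (from a valid range) */
	uint32_t flags;		/* configuration flags */
};

#define GET_RCLK_FLAGS_ENA (1U << 0)

/* State names come from the text; the numeric values are assumed. */
enum if_eec_state {
	IF_EEC_STATE_INVALID,
	IF_EEC_STATE_FREERUN,
	IF_EEC_STATE_LOCKED,
	IF_EEC_STATE_LOCKED_HO_ACQ,
	IF_EEC_STATE_HOLDOVER,
};

/* Hypothetical helper: count outputs that have the ENA flag set. */
static size_t rclk_count_enabled(const struct if_get_rclk_msg *outs, size_t n)
{
	size_t i, enabled = 0;

	assert(outs != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (outs[i].flags & GET_RCLK_FLAGS_ENA)
			enabled++;
	return enabled;
}

/* Hypothetical helper: short printable name for an EEC state. */
static const char *eec_state_name(enum if_eec_state state)
{
	switch (state) {
	case IF_EEC_STATE_FREERUN:
		return "freerun";
	case IF_EEC_STATE_LOCKED:
		return "locked";
	case IF_EEC_STATE_LOCKED_HO_ACQ:
		return "locked-ho-acq";
	case IF_EEC_STATE_HOLDOVER:
		return "holdover";
	case IF_EEC_STATE_INVALID:
	default:
		return "invalid";
	}
}
```

The N IFLA_RCLK_STATE_OUT_STATE payloads would be collected into the array passed to rclk_count_enabled(); the netlink parsing itself is left out on purpose.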
Just to be clear, I have much the same objections to this UAPI as I had to v2:
- RTM_GETEECSTATE will become obsolete as soon as DPLL object is added.
- Reporting pins through the netdevices that use them allows for configurations that are likely invalid, like disjoint "frequency bridges".
- It's not clear what enabling several pins means, and it's not clear whether this genericity is not going to be an issue in the future when we know what enabling more pins means.
- No way as a user to tell whether two interfaces that report the same pins are actually connected to the same EEC. How many EEC's are there, in the system, anyway?
In particular, I think that the proposed UAPIs should belong to a DPLL object. That object must know about the pins, so have it enumerate them. That object needs to know about which pin/s to track, so configure it there. That object has the state, so have it report it. Really, it looks basically 1:1 vs. the proposed API, except the object over which the UAPIs should be defined is a DPLL, not a netdev.
-----Original Message----- From: Petr Machata petrm@nvidia.com Sent: Thursday, November 11, 2021 1:43 PM To: Machnikowski, Maciej maciej.machnikowski@intel.com Subject: Re: [PATCH v3 net-next 6/6] docs: net: Add description of SyncE interfaces
Maciej Machnikowski maciej.machnikowski@intel.com writes:
Add Documentation/networking/synce.rst describing new RTNL messages and respective NDO ops supporting SyncE (Synchronous Ethernet).
Signed-off-by: Maciej Machnikowski maciej.machnikowski@intel.com
...
Just to be clear, I have much the same objections to this UAPI as I had to v2:
- RTM_GETEECSTATE will become obsolete as soon as DPLL object is added.
Yes for more complex devices and no for simple ones
- Reporting pins through the netdevices that use them allows for configurations that are likely invalid, like disjoint "frequency bridges".
Not sure if I understand that comment. In target application a given netdev will receive an ESMC message containing the quality of the clock that it has on the receive side. The upper layer software will see QL_PRC on one port and QL_EEC on other and will need to enable clock output from the port that received QL_PRC, as it's the higher clock class. Once the EEC reports Locked state all other ports that are traceable to a given EEC (either using the DPLL subsystem, or the config file) will start reporting QL_PRC to downstream devices.
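The selection step described above could look roughly like the following C sketch. It assumes a userspace daemon that already knows the quality level received on each port. The struct if_set_rclk_msg layout comes from the proposed documentation; the bit position of SET_RCLK_FLAGS_ENA, the numeric ranking of QL_EEC and QL_PRC, and the names port_ql, pick_best_port() and rclk_enable_req() are hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout from the proposed documentation; uint32_t stands in for __u32. */
struct if_set_rclk_msg {
	uint32_t ifindex;	/* interface index */
	uint32_t out_idx;	/* output index (from a valid range) */
	uint32_t flags;		/* configuration flags */
};

#define SET_RCLK_FLAGS_ENA (1U << 0)	/* bit position is an assumption */

/* Hypothetical ranking: a larger value means a better clock class. */
enum esmc_ql { QL_EEC = 0, QL_PRC = 1 };

/* Hypothetical record of the last quality level seen on a port. */
struct port_ql {
	uint32_t ifindex;
	enum esmc_ql ql;
};

/* Return the port with the best received quality level, or NULL if none. */
static const struct port_ql *pick_best_port(const struct port_ql *ports,
					    size_t n)
{
	const struct port_ql *best = NULL;
	size_t i;

	for (i = 0; i < n; i++)
		if (!best || ports[i].ql > best->ql)
			best = &ports[i];
	return best;
}

/* Build the request that enables recovered clock output out_idx on a port. */
static struct if_set_rclk_msg rclk_enable_req(const struct port_ql *port,
					      uint32_t out_idx)
{
	struct if_set_rclk_msg msg;

	assert(port != NULL);
	msg.ifindex = port->ifindex;
	msg.out_idx = out_idx;
	msg.flags = SET_RCLK_FLAGS_ENA;
	return msg;
}
```

With one port reporting QL_EEC and another QL_PRC, pick_best_port() returns the QL_PRC port and the resulting message would be sent as RTM_SETRCLKSTATE. Waiting for the EEC to report Locked before advertising QL_PRC downstream is not shown.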
- It's not clear what enabling several pins means, and it's not clear whether this genericity is not going to be an issue in the future when we know what enabling more pins means.
It means that the recovered frequency will appear on 2 or more physical pins of the package.
- No way as a user to tell whether two interfaces that report the same pins are actually connected to the same EEC. How many EEC's are there, in the system, anyway?
For now we can fix that with the config file, for future we will be able to trace them to the same EEC. It's like BC in PTP - you can rely on automated mode, but can also override it with the config file.
In particular, I think that the proposed UAPIs should belong to a DPLL object. That object must know about the pins, so have it enumerate them. That object needs to know about which pin/s to track, so configure it there. That object has the state, so have it report it. Really, it looks basically 1:1 vs. the proposed API, except the object over which the UAPIs should be defined is a DPLL, not a netdev.
RCLK pin API does not belong to the DPLL and never will. That part will always belong to the netdev.
Machnikowski, Maciej maciej.machnikowski@intel.com writes:
-----Original Message----- From: Petr Machata petrm@nvidia.com Sent: Thursday, November 11, 2021 1:43 PM To: Machnikowski, Maciej maciej.machnikowski@intel.com Subject: Re: [PATCH v3 net-next 6/6] docs: net: Add description of SyncE interfaces
Maciej Machnikowski maciej.machnikowski@intel.com writes:
Add Documentation/networking/synce.rst describing new RTNL messages and respective NDO ops supporting SyncE (Synchronous Ethernet).
Signed-off-by: Maciej Machnikowski maciej.machnikowski@intel.com
...
Just to be clear, I have much the same objections to this UAPI as I had to v2:
- RTM_GETEECSTATE will become obsolete as soon as DPLL object is added.
Yes for more complex devices and no for simple ones
If we have an interface suitable for more complex netdevices, the simpler ones can use it as well. Should in fact. There should not be two interfaces for the same thing. For reasons of maintenance, documentation, tool support, user experience.
Machnikowski, Maciej maciej.machnikowski@intel.com writes:
- Reporting pins through the netdevices that use them allows for configurations that are likely invalid, like disjoint "frequency bridges".
Not sure if I understand that comment. In target application a given netdev will receive an ESMC message containing the quality of the clock that it has on the receive side. The upper layer software will see QL_PRC on one port and QL_EEC on other and will need to enable clock output from the port that received QL_PRC, as it's the higher clock class. Once the EEC reports Locked state all other ports that are traceable to a given EEC (either using the DPLL subsystem, or the config file) will start reporting QL_PRC to downstream devices.
I think I had the reading of the UAPI wrong. So RTM_SETRCLKSTATE means, take the clock recovered from ifindex, and send it to pins that I have marked with the ENA flag.
But that still does not work well for multi-port devices. I can set it up to forward frequency from swp1 to swp2 and swp3, from swp4 to swp5 and swp6, etc. But in reality I only have one underlying DPLL and can't support this. So yeah, obviously, I bounce it in the driver. It also means that when I want to switch tracking from swp1 to swp2, I first need to unset all the swp1 pins (64 messages or whatever) and then set it up at swp2 (64 more messages). As a user I still don't know which of my ports share DPLL. It's just not a great interface for multi-port devices.
Having this stuff at a dedicated DPLL object would make the issue go away completely. A driver then instantiates one DPLL, sets it up with RCLK pins and TX pins. The DPLL can be configured with which pin to take the frequency from, and which subset of pins to forward it to. There are as many DPLL objects as there are DPLL circuits in the system.
This works for simple port devices as well as switches, as well as non-networked devices.
The in-driver LOC overhead is a couple of _init / _fini calls and an ops structure that the DPLL subsystem uses to talk to the driver. Everything else remains the same.
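To make the shape of such an object concrete, here is a minimal sketch in C of a DPLL that owns its pins, tracks exactly one source pin and forwards to a subset of TX pins. None of this is an existing kernel API: struct dpll, the pin types and every dpll_*() function below are hypothetical names invented for the example, and the 64-pin limit is an arbitrary choice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DPLL_MAX_PINS	64
#define DPLL_NO_SOURCE	(-1)

enum dpll_pin_type { DPLL_PIN_RCLK, DPLL_PIN_TX };

/* Hypothetical DPLL object: one instance per DPLL circuit in the system. */
struct dpll {
	enum dpll_pin_type pins[DPLL_MAX_PINS];
	size_t n_pins;
	int source;		/* tracked RCLK pin, or DPLL_NO_SOURCE */
	uint64_t forward_mask;	/* bit i set: frequency forwarded to pin i */
};

static void dpll_init(struct dpll *d)
{
	d->n_pins = 0;
	d->source = DPLL_NO_SOURCE;
	d->forward_mask = 0;
}

/* Register a pin with the DPLL; returns its index or -1 when full. */
static int dpll_add_pin(struct dpll *d, enum dpll_pin_type type)
{
	if (d->n_pins >= DPLL_MAX_PINS)
		return -1;
	d->pins[d->n_pins] = type;
	return (int)d->n_pins++;
}

/* One source per DPLL: selecting a new pin replaces the old one. */
static bool dpll_set_source(struct dpll *d, int pin)
{
	if (pin < 0 || (size_t)pin >= d->n_pins ||
	    d->pins[pin] != DPLL_PIN_RCLK)
		return false;
	d->source = pin;
	return true;
}

/* Choose the subset of TX pins that receive the tracked frequency. */
static bool dpll_set_forwarding(struct dpll *d, uint64_t mask)
{
	size_t i;

	for (i = 0; i < DPLL_MAX_PINS; i++) {
		if (!(mask & (UINT64_C(1) << i)))
			continue;
		if (i >= d->n_pins || d->pins[i] != DPLL_PIN_TX)
			return false;
	}
	d->forward_mask = mask;
	return true;
}
```

In this model, switching tracking from one port to another is a single dpll_set_source() call, and a configuration with two independent sources on one DPLL cannot be expressed at all.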
- It's not clear what enabling several pins means, and it's not clear whether this genericity is not going to be an issue in the future when we know what enabling more pins means.
It means that the recovered frequency will appear on 2 or more physical pins of the package.
Yes, agreed now.
-----Original Message----- From: Petr Machata petrm@nvidia.com Sent: Tuesday, November 16, 2021 12:53 PM To: Machnikowski, Maciej maciej.machnikowski@intel.com Subject: Re: [PATCH v3 net-next 6/6] docs: net: Add description of SyncE interfaces
Machnikowski, Maciej maciej.machnikowski@intel.com writes:
- Reporting pins through the netdevices that use them allows for configurations that are likely invalid, like disjoint "frequency bridges".
Not sure if I understand that comment. In target application a given netdev will receive an ESMC message containing the quality of the clock that it has on the receive side. The upper layer software will see QL_PRC on one port and QL_EEC on other and will need to enable clock output from the port that received QL_PRC, as it's the higher clock class. Once the EEC reports Locked state all other ports that are traceable to a given EEC (either using the DPLL subsystem, or the config file) will start reporting QL_PRC to downstream devices.
I think I had the reading of the UAPI wrong. So RTM_SETRCLKSTATE means, take the clock recovered from ifindex, and send it to pins that I have marked with the ENA flag.
But that still does not work well for multi-port devices. I can set it up to forward frequency from swp1 to swp2 and swp3, from swp4 to swp5 and swp6, etc. But in reality I only have one underlying DPLL and can't support this. So yeah, obviously, I bounce it in the driver. It also means that when I want to switch tracking from swp1 to swp2, I first need to unset all the swp1 pins (64 messages or whatever) and then set it up at swp2 (64 more messages). As a user I still don't know which of my ports share DPLL. It's just not a great interface for multi-port devices.
This will only be done on init - after everything is configured - you will not really need to check anything there.
Having this stuff at a dedicated DPLL object would make the issue go away completely. A driver then instantiates one DPLL, sets it up with RCLK pins and TX pins. The DPLL can be configured with which pin to take the frequency from, and which subset of pins to forward it to. There are as many DPLL objects as there are DPLL circuits in the system.
This works for simple port devices as well as switches, as well as non-networked devices.
The in-driver LOC overhead is a couple of _init / _fini calls and an ops structure that the DPLL subsystem uses to talk to the driver. Everything else remains the same.
That won't work - a single recovered clock may be physically connected to more than one DPLL device and a single DPLL device may be used for more than one MAC chip at the same time - we shouldn't mix subsystems as recovered clocks belong to PHY/MAC layer.
Also in that case the DPLL would need to track the relation between all netdev ports upstream - which will be nightmare to keep track of when ports reset/get removed or added.
Also the netdev is the one that will receive the packet containing quality so the userspace app will know which netdev received it and not which DPLL pin it should configure. I think this approach will make everything more complex (unless I'm missing something).
- It's not clear what enabling several pins means, and it's not clear whether this genericity is not going to be an issue in the future when we know what enabling more pins means.
It means that the recovered frequency will appear on 2 or more physical pins of the package.
Yes, agreed now.